Names + whitespace

NOTE: this is part of a series about (re-)learning JavaScript.

Names + whitespace in JavaScript

Spoiler: ಠ_ಠ is a valid name.

Whitespace

The best place to start is the very beginning: a blank file.

Whitespace in JavaScript is normally unimportant, except when it’s needed to separate tokens that would otherwise run together and be read as something else. For instance:

var firstName = 'Niccolò';

You cannot remove the space between var and firstName, as that would change the meaning of the statement, but the rest is just for readability. JavaScript can use the equal sign, quotes, and semicolon to separate and interpret the instructions correctly even if you omit the other spaces.
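Here’s a minimal sketch of that difference. The errorName variable and the use of the Function constructor are just for illustration (they let us try the broken version without breaking the rest of the script), and the strict-mode directive is there so the mistake shows up as an error instead of silently creating a global:

```javascript
// With the optional spaces removed, the statement still means the same thing.
var firstName='Niccolò';

// Without the space after `var`, the parser sees a single identifier,
// `varfirstName`. In strict mode, assigning to that undeclared name throws.
var errorName = null;
try {
    new Function("'use strict'; varfirstName = 'Niccolò';")();
} catch (e) {
    errorName = e.name;
}

console.log(firstName); // Niccolò
console.log(errorName); // ReferenceError
```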

Indentation

To make the code easier for other programmers (and yourself) to understand, you should indent it using 2 to 4 spaces or the tab character:

if (1 === 1) {
    alert('Hello, there!');
} else {
    console.log('We have a problem, here');
}

Although it comes down to personal preference whether you choose spaces or tabs, both are common in JavaScript code, and several widely used style guides prescribe spaces.

Using tabs is good because the displayed width can usually be adjusted in your editor without changing the code itself, while spaces are better for copying and pasting to and from the Web.

I personally don’t really care as editors can easily switch between the two seamlessly, letting you use the tab key and inserting your specified number of space characters automatically, etc.

Coding style

Again, to improve readability, you should be mindful of how you format instructions. This is personal preference as well; just pick your favorite style and be consistent.

Personally, I think it’s pretty much wrong if you don’t put a space after if, for, and while statements, because it’s nice to differentiate them from function calls. For instance, this:

if (whatever) {
    myfunc();
}

Looks a lot better than:

if(whatever) {
    myfunc();
}

Well, I said it wasn’t important and I just wasted 30 seconds of your life you can’t get back, but that just bothers me so much when I see it…

Actually, it is important. Do what this guy does: http://javascript.crockford.com/code.html

Comments

You might know this already, but comments are blocks of text that are completely ignored by JavaScript, and are used to make it easier for other programmers (and yourself) to understand the code.

In JavaScript, there are two kinds of comments.

Line-ending comments are one-line comments:

// hello, I'm a comment

Block comments are for longer comments:

/*
 put whatever here, JavaScript will ignore it (unless you're commenting out
 a regular expression containing */, which would cause this block
 to close prematurely)
 */
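To see that regular expression pitfall in action, here’s a small sketch. The parses helper is hypothetical (it isn’t part of JavaScript); it just asks the engine to compile a snippet with the Function constructor and reports whether that succeeded:

```javascript
// Hypothetical helper: returns true if `source` compiles as a function body.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (e) {
        return false;
    }
}

// The `*/` inside the regular expression ends the block comment early,
// leaving a stray `; */` behind, which is a syntax error.
var blockCommented = "/* var re = /a*/; */";

// A line-ending comment has no closing delimiter, so it is safe here.
var lineCommented = "// var re = /a*/;";

console.log(parses(blockCommented)); // false
console.log(parses(lineCommented));  // true
```

This is one reason to prefer line-ending comments when temporarily commenting out code.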

Names

Character set

Internally, JavaScript exposes strings as sequences of 16-bit code units, much like the UCS-2 character set. This means that whatever you throw at it, it will treat it as a series of 16-bit units, because that’s all it knows.

This is a great article you could read: https://mathiasbynens.be/notes/javascript-encoding. If you need proof, this demonstrates that JavaScript doesn’t treat a surrogate pair as a single character: http://jsfiddle.net/a8L6gzjv/.
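If you’d rather check this locally, here’s a minimal sketch (the variable names are mine, and this is not necessarily what the fiddle above does). It compares a character inside the Basic Multilingual Plane with one outside it, which has to be stored as two 16-bit code units:

```javascript
// ò (U+00F2) fits in a single 16-bit code unit.
var bmpLetter = '\u00F2';

// U+1F4A9 lies outside the Basic Multilingual Plane, so it is stored
// as a surrogate pair: two 16-bit code units.
var astralSymbol = '\uD83D\uDCA9';

console.log(bmpLetter.length);    // 1
console.log(astralSymbol.length); // 2

// Each half of the pair is reported as its own "character".
console.log(astralSymbol.charCodeAt(0).toString(16)); // d83d
console.log(astralSymbol.charCodeAt(1).toString(16)); // dca9
```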

If you don’t know what a character set is, or you don’t fully understand how character sets work, you MUST read “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)” by Joel Spolsky. Also, here’s a great video about Unicode that is definitely worth watching: youtube.com/watch?v=MijmeoH9LT4

Valid names in JavaScript

Names are used for:

  • statements
  • variables
  • parameters
  • property names
  • operators
  • labels

As a programmer, you pick variables, function names, parameters, property names, and labels. As a whole, these can be referred to as identifier names.

Reserved words

Reserved words are names that cannot be used as identifier names, because they have a special meaning to the language, or have been reserved to possibly have a special meaning in future versions.

The ECMAScript specification describes four groups of reserved words:

  1. keywords
  2. future reserved words
  3. null literals
  4. boolean literals

Keywords are tokens that have special meaning in JavaScript:

  • break
  • case
  • catch
  • continue
  • debugger
  • default
  • delete
  • do
  • else
  • finally
  • for
  • function
  • if
  • in
  • instanceof
  • new
  • return
  • switch
  • this
  • throw
  • try
  • typeof
  • var
  • void
  • while
  • with

Future reserved words are tokens that may become keywords in the future:

  • class
  • const
  • enum
  • export
  • extends
  • import
  • super

Some future reserved words only apply in strict mode:

  • implements
  • interface
  • let
  • package
  • private
  • protected
  • public
  • static
  • yield
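As a rough sketch of how the two lists of future reserved words differ in practice: the parses helper below is hypothetical (it just tries to compile a snippet with the Function constructor), and I picked package and enum as arbitrary examples from the lists above.

```javascript
// Hypothetical helper: returns true if `source` compiles as a function body.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (e) {
        return false;
    }
}

// `package` is only reserved in strict mode.
var sloppyPackage = parses("var package = 1;");
var strictPackage = parses("'use strict'; var package = 1;");

// `enum` is reserved everywhere.
var sloppyEnum = parses("var enum = 1;");

console.log(sloppyPackage); // true
console.log(strictPackage); // false
console.log(sloppyEnum);    // false
```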

Null literals are just null.

Boolean literals are true and false.

In addition to the above, there are read-only properties of the global object (no reason to understand what that means right now) that you can theoretically assign to without any error being thrown outside strict mode. The assignment simply wouldn’t do anything, so they shouldn’t be used as names:

  • NaN
  • Infinity
  • undefined
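Here’s a small sketch of what “wouldn’t do anything” looks like. I’m running the snippets through the Function constructor only so that the first one is guaranteed to run outside strict mode, whatever mode the surrounding script is in; the variable names are made up for the example:

```javascript
// Outside strict mode, assigning to NaN is silently ignored.
var afterAssignment = new Function("NaN = 5; return NaN;")();

// NaN is the only value that is not equal to itself.
console.log(afterAssignment !== afterAssignment); // true

// In strict mode, the same kind of assignment throws instead.
var strictError = null;
try {
    new Function("'use strict'; undefined = 5;")();
} catch (e) {
    strictError = e.name;
}

console.log(strictError); // TypeError
```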

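NOTE: starting with ES5, reserved words can be used as property names using dot notation: mySchool.class works. Bracket notation with a quoted string, mySchool['class'], has always worked, while mySchool[class] doesn’t, because class on its own is not a valid expression.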
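A quick sketch of that note, assuming an ES5 or later engine. The mySchool object comes from the note; the property values and the parses helper are made up for the example:

```javascript
// Hypothetical helper: returns true if `source` compiles as a function body.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (e) {
        return false;
    }
}

var mySchool = {};

// Dot notation with a reserved word is fine from ES5 onwards.
mySchool.class = 'Biology';

// A quoted string in brackets refers to the same property.
console.log(mySchool['class']); // Biology

// An unquoted reserved word in brackets is a syntax error.
console.log(parses("var s = {}; return s[class];")); // false
```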

Valid identifiers

The ES spec differentiates between identifiers and identifier names. Identifiers are valid identifier names minus reserved words. The way I see it, that means that identifier names are words that are valid in JavaScript, while identifiers are words that are valid in JavaScript and can be chosen for variables, parameters, labels, etc.

So, what characters are allowed in identifier names?

An identifier must start with $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”.

The rest of the string can contain the same characters, plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.

https://mathiasbynens.be/notes/javascript-identifiers
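The two rules above translate fairly directly into a regular expression. This is only a sketch under a few assumptions: the function name isIdentifierName is mine, it needs an engine that supports Unicode property escapes (\p{…} with the u flag), it doesn’t understand Unicode escape sequences, and it checks identifier names only, so reserved words like var still pass.

```javascript
// Characters allowed at the start: $, _, and the letter categories listed above.
var startChars = '$_\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}';

// Characters allowed after the start: the same set, plus ZWNJ, ZWJ,
// combining marks, decimal digits, and connector punctuation.
var partChars = startChars + '\\u200C\\u200D\\p{Mn}\\p{Mc}\\p{Nd}\\p{Pc}';

var identifierNamePattern = new RegExp('^[' + startChars + '][' + partChars + ']*$', 'u');

// Hypothetical helper: true if `name` follows the character rules above.
function isIdentifierName(name) {
    return identifierNamePattern.test(name);
}

console.log(isIdentifierName('firstName')); // true
console.log(isIdentifierName('ಠ_ಠ'));       // true
console.log(isIdentifierName('$'));         // true
console.log(isIdentifierName('1st'));       // false
console.log(isIdentifierName('my-name'));   // false
```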

So, normally you’ll read that identifiers in JavaScript are a letter or underscore followed by a series of letters, underscores, and digits, with the dollar sign added for machine-generated code (of course we ignored that recommendation; see how it’s used everywhere in Prototype and later jQuery).

In reality, you can come up with some crazy stuff:

var ಠ_ಠ = 'Dude...';

Also, Unicode escape sequences can be used, and they give the same result as using the corresponding character:

var \u0061 = 'test';

…is the same as the following, and you can use the two interchangeably:

var a = 'test';

Since they act the same, you cannot sneak otherwise invalid characters into an identifier name by writing them as escape sequences.
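Here’s a short sketch showing that the escaped and unescaped spellings really are the same name. The person object is made up for the example:

```javascript
// Declare the variable using the escape sequence for "a"...
var \u0061 = 'test';

// ...and read it back using the plain character.
console.log(a); // test

// It works the other way around too: both spellings name one variable.
\u0061 = 'changed';
console.log(a); // changed

// The same applies to property names.
var person = { n\u0061me: 'Niccolò' };
console.log(person.name); // Niccolò
```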

One cool thing you can do is trick some browsers into letting you use reserved words for identifiers. For instance, the following is invalid:

var var = 'test';

However, this works (although it technically shouldn’t):

var v\u0061r = 'test';
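If you want to check how your own engine handles this, here’s a minimal sketch. The parses helper is hypothetical (it just tries to compile a snippet with the Function constructor), and the results in the comments are what I’d expect from an engine that follows the spec, so yours may differ:

```javascript
// Hypothetical helper: returns true if `source` compiles as a function body.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (e) {
        return false;
    }
}

// A reserved word spelled with an escape sequence should still be rejected.
var escapedKeyword = parses("var v\\u0061r = 'test';");

// An ordinary name spelled with an escape sequence is fine.
var escapedName = parses("var v\\u0061lue = 'test';");

console.log(escapedKeyword); // false
console.log(escapedName);    // true
```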

Check out “Valid JavaScript variable names” by Mathias Bynens, which is where I learned a lot about the stuff above.

Conclusion

Actually, that’s pretty much it. There are some cool names you can call your variables, but then you wouldn’t be able to type them, so it’s kind of useless.

Still, it’s good to know.


Featured image: Darwin Bell via photopin cc
