
Tuesday, 30 September 2014

JS got cha

A gotcha is something that got you. Not you got it, it got you. A gotcha is something nobody tells you about, because it seems kind of inevitable, like a hole in the road. "Why should I tell anybody? My car crashed into it too!"

We tend not to talk about the bad experiences we had. We don't want to remember them. We prefer to talk about our success stories. That way everybody coming behind us also crashes into that hole. Maybe we don't feel so alone then? One thing is sure: mankind is the dominant species on this planet because it can learn and communicate.

This is about the biggest surprises I had with JavaScript. Normal for a JS programmer, but hard to get used to for a Java programmer.
A road full of holes!

Functions cannot be overloaded

When you read code like the following, what would you expect to be the outcome?

function foo() {
  console.log("foo() was called");
}

function foo(bar1) {
  console.log("foo(bar1) received "+bar1);
}

function foo(bar1, bar2) {
  console.log("foo(bar1, bar2) received "+bar1+", "+bar2);
}

foo();
foo("bar1");
foo("bar1", "bar2");

Output is:

foo(bar1, bar2) received undefined, undefined
foo(bar1, bar2) received bar1, undefined
foo(bar1, bar2) received bar1, bar2
Obviously all three calls go to the foo(bar1, bar2) function.
There is no function overloading in JavaScript. The last definition of foo() survived; all others were silently overwritten, without a warning. A function must be unique by name in its namespace (scope). The parameter list is not part of a function's identity.

That did not prevent our function calls from working without error, because you can call a JS function that declares three parameters with

  • zero
  • one
  • two
  • ... thousand ...
arguments!

Any parameter you do not provide when calling the function is received as undefined. Not only can any parameter be of any type, it can also be absent.
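
Since only one foo can exist, a common workaround is to let a single function branch on how many arguments it actually received. The following is a minimal sketch of that idea, not a recommendation; the name describeCall is made up here, and it returns strings instead of logging so that the result is easy to check.

```javascript
// Hypothetical helper: emulates the three "overloads" of foo() in one
// function by inspecting arguments.length at call time.
function describeCall(bar1, bar2) {
  if (arguments.length === 0)
    return "foo() was called";
  if (arguments.length === 1)
    return "foo(bar1) received " + bar1;
  return "foo(bar1, bar2) received " + bar1 + ", " + bar2;
}

console.log(describeCall());
console.log(describeCall("bar1"));
console.log(describeCall("bar1", "bar2"));
```

Checking arguments.length rather than testing bar1 for undefined keeps a call like describeCall(undefined) distinguishable from describeCall().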

Additionally you can call a function that declares no parameters with as many arguments as you want! That function can use the arguments object, a local, array-like variable that is available inside every function, to get its parameter values.

function foobar() {
  for (var i = 0; i < arguments.length; i++)
    console.log("arguments["+i+"] = "+arguments[i]);
}

foobar("lots", "of", "args", "for", "foobar");

This yields:

arguments[0] = lots
arguments[1] = of
arguments[2] = args
arguments[3] = for
arguments[4] = foobar
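
Note that arguments is only array-like: it has a length and numeric indexes, but none of the Array methods. A small sketch of the usual ES5 idiom for turning it into a real array; joinArgs is a name invented for this illustration.

```javascript
// Hypothetical example function: joins all received arguments with spaces.
function joinArgs() {
  // arguments has no join(); borrow slice() from Array.prototype
  // to copy it into a real array first.
  var args = Array.prototype.slice.call(arguments);
  return args.join(" ");
}

console.log(joinArgs("lots", "of", "args", "for", "foobar"));
```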

Do you want more freedom?

Objects of same "class" can be different

When working with objects, we expect that all instances of a class (or whatever JS defines as a class) will be of the same structure. But in practice a JS Object is a map, and you can add or delete any property on it at any time. And a property can also be a function. Thus objects created by the same constructor function ("class") can be altered to have totally different properties.

var Cat = function(name) {
  this.name = name;
}

var garfield = new Cat("Garfield");

var catbert = new Cat("Catbert");
delete catbert.name;
catbert.nickname = "Bert";

console.log("garfield's name is "+garfield.name);
console.log("garfield's nickname is "+garfield.nickname);
console.log("catbert's name is "+catbert.name);
console.log("catbert's nickname is "+catbert.nickname);

The output of this is:

garfield's name is Garfield
garfield's nickname is undefined
catbert's name is undefined
catbert's nickname is Bert
The two Cat instances now have nothing in common anymore. Applying catbert instanceof Cat would yield true anyway.
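
If you want less rubber, ES5 offers Object.seal(), which blocks adding and deleting properties on an object. The following is a sketch under the assumption that the code runs in strict mode, where such violations throw a TypeError instead of failing silently; SealedCat, tryTo and sealDemo are names invented here.

```javascript
// Hypothetical variant of Cat whose instances cannot gain or lose properties.
var SealedCat = function(name) {
  "use strict";
  this.name = name;
  Object.seal(this);
};

// Runs an action; returns "ok", or the error's name if the action throws.
function tryTo(action) {
  try { action(); return "ok"; }
  catch (e) { return e.name; }
}

function sealDemo() {
  "use strict"; // nested functions inherit strict mode
  var catbert = new SealedCat("Catbert");
  return {
    remove: tryTo(function() { delete catbert.name; }),
    add:    tryTo(function() { catbert.nickname = "Bert"; }),
    rename: tryTo(function() { catbert.name = "Bert"; }),
    name:   catbert.name
  };
}

console.log(sealDemo());
```

Sealing still allows changing the value of an existing property; only the set of properties is fixed.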

One thing JavaScript is not missing at all: flexibility. This language is made of rubber!

Variables are "hoisted"

Hoisting sails might be a vital task for a ship. Hoisting variable declarations to the top of the function body can be fatal for a function, because it ignores the programmer's intent. Nonetheless JavaScript does exactly that. It pulls the declaration of any local variable (declared with var) out of its block and puts it at the top of the function body. Only the declaration moves; an assignment stays where it was written.

Consider the following code:

    var connectExpandControls = function() {
      var expandControls = document.body.getElementsByClassName("expandcontrol");

      for (var i = 0; i < expandControls.length; i++) {
        var expandControl = expandControls[i];
        var parent = expandControl.parentNode;
        var children = parent.children;
        var nextSibling, previous;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          var element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }

        if (nextSibling)
          connect(expandControl, nextSibling);
      }
    };

This code loops over all elements with class "expandcontrol". For each of them it searches for the next sibling DOM element, and if there is one, it connects that sibling with the expandcontrol (whatever that means).

But this is wrong! See the bug? It would be interesting to know how long an experienced JS programmer needs to find it.

This code searches the next sibling for the first expandcontrol, and then it connects all other expandcontrol instances to that first found sibling!

Other languages like C++ or Java limit the existence (and initialization) of a local variable to the { block braces } it was written in. JavaScript is more complicated: it "hoists" all var declarations within a function body to the top of the function, out of their blocks. Thus the variable nextSibling gets a value during the first pass of the outer loop and then keeps it, because it is not created anew each time the loop body is entered! The consequence is that the sibling-search loop is not executed for any expandcontrol after the first, because nextSibling already has a value.
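
The effect can be reduced to a few lines. This is a sketch, with withVar and withLet as made-up names; the second function uses let, which only exists in engines that support ES2015 and is shown purely for contrast.

```javascript
// var: one variable per function, so the value survives from pass to pass
function withVar() {
  var seen = [];
  for (var i = 0; i < 3; i++) {
    var found;               // hoisted; NOT reset on each pass
    seen.push(found);
    found = "pass " + i;
  }
  return seen;
}

// let (ES2015): a fresh variable every time the block is entered
function withLet() {
  var seen = [];
  for (var i = 0; i < 3; i++) {
    let found;               // starts as undefined on each pass
    seen.push(found);
    found = "pass " + i;
  }
  return seen;
}

console.log(withVar()); // [ undefined, 'pass 0', 'pass 1' ]
console.log(withLet()); // [ undefined, undefined, undefined ]
```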

Here is a fixed version of the code:

    var connectExpandControls = function() {
      var expandControls = ....

      for (var i = 0; i < expandControls.length; i++) {
        ....
        var nextSibling = undefined, previous = undefined;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          var element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }

        ....
      }
    };

The difference is that the variables nextSibling and previous are now explicitly reset to undefined each time the body of the outer loop is entered. Thus the inner loop is executed on each pass.

The following variant of the fix is the one a lot of JS programmers would recommend. They argue:

"Do not behave as if variables were not hoisted. Write them where the interpreter will put them anyway, so that you keep control of their values."
    var connectExpandControls = function() {
      var expandControls = document.body.getElementsByClassName("expandcontrol");
      var expandControl;
      var parent;
      var children;
      var nextSibling, previous;
      var element;

      for (var i = 0; i < expandControls.length; i++) {
        expandControl = expandControls[i];
        parent = expandControl.parentNode;
        children = parent.children;
        nextSibling = undefined;
        previous = undefined;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }

        if (nextSibling)
          connect(expandControl, nextSibling);
      }
    };

This is the reason why so many JS functions start with a long list of local variables. Moreover, JS functions are used as modules, and these contain a lot of other functions and even sub-modules. So the scope and visibility of "local" variables becomes hard to control (simply because var declarations are scoped to the whole function, not to the block they were written in).

This makes life hard when refactoring big functions by splitting them into smaller ones. And there are a lot of big functions out there!
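
One way to get block-like scoping back in ES5 is to move the loop body into its own function, because every call gets fresh local variables. The following is only a sketch of that refactoring: findNextSibling and connectAll are names invented here, and connect is passed in as a parameter because the original code does not show where it comes from.

```javascript
// Hypothetical helper: returns the element that follows 'control' among its
// parent's children, or undefined when 'control' is the last child.
function findNextSibling(control) {
  var children = control.parentNode.children;
  var previous;   // fresh on every call, so no stale value can leak in
  for (var j = 0; j < children.length; j++) {
    if (previous === control)
      return children[j];
    previous = children[j];
  }
  return undefined;
}

// Hypothetical replacement for the loop in connectExpandControls.
function connectAll(expandControls, connect) {
  for (var i = 0; i < expandControls.length; i++) {
    var nextSibling = findNextSibling(expandControls[i]);
    if (nextSibling)
      connect(expandControls[i], nextSibling);
  }
}
```

In a page this might be called as connectAll(document.body.getElementsByClassName("expandcontrol"), connect), assuming connect is defined as in the code above.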

No dependency definitions

C and C++ have include directives, Java and other more modern languages prefer import (to avoid preprocessors), but JavaScript, as of ECMAScript 5, has nothing at all. To realize what that means, imagine the following JS code dependencies of an HTML page:

  <script type="text/javascript" src="js/folding.js"></script>
  <script type="text/javascript" src="js/ajax.js"></script>
  <script type="text/javascript" src="js/sourceDisplay.js"></script>

  <script type="text/javascript">
    "use strict";
    
    window.addEventListener("load", function() {
      var target = document.getElementById("sourceGoesHere");
      var sourceDisplayer = sourceDisplay.create();
      sourceDisplayer.displayPage(target);
    });
  </script>

This HTML page loads three scripts, although it obviously uses just one JS object: sourceDisplay. It does so because sourceDisplay won't work when folding and ajax are missing.

Here is an outline of the dependencies as they exist in the corresponding JS code:

  • sourceDisplay.js
    • folding.js
    • ajax.js

That means that not only does the JS source code contain dependencies on external variables or functions, but the HTML page that uses that code also has to repeat these dependencies. It is an inevitable consequence that these duplicated dependency definitions will break at some point.

So how can you import a JavaScript file reliably into your HTML page?

Carefully read the JS code and look for variables or functions that are not defined. Once you have found them, you have to look around for JS files that define the missing identifiers. Having found all definitions and having eliminated ambiguities, you can finally write script tags into the HTML page. This will hold until the next release, when you must do it all again. Releases are weekly :-)
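
Without a real import mechanism, a script can at least fail loudly when something it depends on is missing. A minimal sketch: requireGlobals is a hypothetical helper, and the idea that folding.js and ajax.js define globals named folding and ajax is an assumption, not something the code above shows.

```javascript
// Hypothetical helper: throws a descriptive error if any of the named
// properties is undefined on the given scope object (e.g. window).
function requireGlobals(scope, names, requiredBy) {
  var missing = [];
  for (var i = 0; i < names.length; i++)
    if (scope[names[i]] === undefined)
      missing.push(names[i]);

  if (missing.length > 0)
    throw new Error(requiredBy + " needs: " + missing.join(", "));
}

// Assumed usage at the top of sourceDisplay.js (the global names are guesses):
// requireGlobals(window, ["folding", "ajax"], "sourceDisplay.js");
```

This does not remove the duplicated dependency list, but a forgotten script tag then shows up as one clear error message at load time instead of an undefined-variable failure somewhere deep in the code.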

Hard and soft comparisons

In JS there are two different comparison operators. In addition to the traditional "==" there is also "===", and for the negation "!=" there is "!==". The semantics differ: "==" is loose equality (soft, with type conversion), "===" is strict equality (hard, often called identity).

Expression                  Result
null == undefined           true
null === undefined          false
null == 0                   false
null === 0                  false
null == '0'                 false
null === '0'                false
0 == undefined              false
0 === undefined             false
0 == '0'                    true
0 === '0'                   false
0 == ''                     true
0 === ''                    false
0 == new String('')         true
0 === new String('')        false
0 == '\t\r\n '              true
0 === '\t\r\n '             false
'' == undefined             false
'' === undefined            false
'' == new String('')        true
'' === new String('')       false
'' == false                 true
'' === false                false
false == 'false'            false
false === 'false'           false
false == '0'                true
false === '0'               false
false == 0                  true
false === 0                 false
false == undefined          false
false === undefined         false
false == null               false
false === null              false

This is close to a science; look at these charts on stackoverflow.
The fact is that "==" coerces the operands to a common type when their types differ, while "===" does not (and thus can also be faster).
Strict comparison is what we would expect most of the time.
I use only "===" and "!==", and no longer "==" and "!=".
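
One more reason to avoid "==" is that it is not even transitive. A short sketch using values from the table above; the variable names are made up for this illustration.

```javascript
// Loose equality: a == b and b == c, and yet a != c
var looseAB = ("0" == 0);    // true: "0" is converted to the number 0
var looseBC = (0 == "");     // true: "" is converted to the number 0
var looseAC = ("0" == "");   // false: two strings, compared as they are

// Strict equality: no conversion, so no surprises
var strictAB = ("0" === 0);  // false: different types
var strictBC = (0 === "");   // false: different types
var strictAC = ("0" === ""); // false: different strings

console.log(looseAB, looseBC, looseAC);    // true true false
console.log(strictAB, strictBC, strictAC); // false false false
```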

And this is how an if condition works with these expressions:

Condition                   Treated as
if (undefined)              false
if (null)                   false
if (0)                      false
if ('0')                    true
if ('')                     false
if (new String(''))         true
if ('\t\r\n ')              true

Due to such strange behavior you cannot substitute

if ( ! rainy )

with

if ( rainy === false )

in JavaScript the way you can in Java at any time, without even thinking.
If rainy is undefined or null, you might get quite unexpected results, because undefined !== false and null !== false ...
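
To see where the two conditions part ways, both can be evaluated for a few candidate values. A small sketch; compareChecks is a name invented here.

```javascript
// Hypothetical helper: evaluates both conditions for one value of 'rainy'.
function compareChecks(rainy) {
  return {
    notRainy: !rainy,               // true for every falsy value
    strictlyFalse: rainy === false  // true only for the boolean false
  };
}

var candidates = [false, undefined, null, 0, ""];
for (var i = 0; i < candidates.length; i++) {
  var result = compareChecks(candidates[i]);
  console.log(candidates[i], result.notRainy, result.strictlyFalse);
}
```

Only for the boolean false do both checks agree; for undefined, null, 0 and the empty string the first is true and the second is false.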

Parameter default assignments

Another nice pitfall is the popular default definition for parameters, which goes badly wrong for boolean parameters:

var calculateWidth = function(element, calculateMaximum) {
  var maximum = calculateMaximum || true; // don't do this!
  if (maximum)
    ....
}

The intent of this is to give the variable maximum a default value of true when the caller has not provided that parameter. But the outcome is fatal, because maximum will never be false. Consider the following cases:

  calculateWidth(theElement);
  // var maximum = undefined || true;
  // -> this works as intended

  calculateWidth(theElement, false);
  // var maximum = false || true;
  // -> as false won't ever evaluate to true, maximum will ALWAYS be true!

So you must always know the parameter type when using this idiom.
At least for boolean parameters you must do the following:

var calculateWidth = function(element, calculateMaximum) {
  var maximum = (calculateMaximum !== undefined) ? calculateMaximum : true; // correct!
  ....
}
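
Where ES2015 is available, default parameter values express the same intent directly: the default is used only when the argument is undefined (absent or passed explicitly), so false survives. A sketch under that assumption; resolveMaximum is a hypothetical stand-in for calculateWidth that simply returns the resolved flag.

```javascript
// Hypothetical stand-in for calculateWidth: returns the flag it would use.
// Requires an engine with ES2015 default parameters.
function resolveMaximum(element, calculateMaximum = true) {
  return calculateMaximum;
}

console.log(resolveMaximum("el"));            // true
console.log(resolveMaximum("el", false));     // false
console.log(resolveMaximum("el", undefined)); // true
console.log(resolveMaximum("el", null));      // null
```

Note that null does not trigger the default, which matches the behavior of the (calculateMaximum !== undefined) check above.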

