What's Java's number one problem? If you said "efficiency," then you win! Why do we still use C and C++? If you said "efficiency," then you also win!
And so... yesterday, we learned about templates and generics and so forth, and I was able to ask a question that's been bugging me for quite some time. "When you declare an ArrayList<Integer>, are Integer Objects being stored internally, or are primitive ints being stored?"
Prior to yesterday, I had given this quite a bit of thought, actually. I'd say about a year ago, I determined that since the classes like Integer, Byte, Float were final, non-inheritable classes, that Java could actually optimize the language by artificially storing primitive ints, auto-unboxing and auto-boxing them, as appropriate. But, as it turned out, I was wrong!
Java does not do that. Instead, when you store a bunch of "primitive ints" in an ArrayList<Integer>, what actually happens is that the ints are wrapped by Integer Objects, and the Integer Objects are essentially up-casted to Object internally and stored in the ArrayList. Then, when you grab something from the ArrayList, it is essentially down-casted to Integer again, and converted back to a primitive int, if necessary. This solution provides backward compatibility, but it is VERY inefficient.
Observe...
If I have a normal Java array of int of 10,000 elements...
int[] foo = new int[10000];
The memory allocated here is (10,000 elements) * (4 bytes per element) + 16 byte array overhead = 40016 bytes
The array has a length (integer), and you should also count the pointer to the array, etc.
If I have an ArrayList<Integer> of 10,000 Integer Objects...
ArrayList<Integer> foo = new ArrayList<Integer> (10000);
The memory allocated here is (8 bytes for Object) + (4 bytes for 32-bit pointer to actual array) + (4 bytes for "size" variable) + (16 bytes of array overhead) + (4 bytes from parent class instance variable -- another size) + (4 bytes of crap overhead -- don't ask) + (4 bytes per Object reference * 10000 elements) = 40040 bytes... all to store null Integer Object references. If you actually place integers into the ArrayList, then each Integer Object will take up 16 bytes -- (8 bytes for Object) + (4 bytes for int inside Object) + (4 bytes of crap overhead). So, if you actually fill that ArrayList<Integer> with data, you will use 200,040 bytes. That's roughly FIVE times more than what was really required!!
NOT TO MENTION THE ADDITIONAL INEFFICIENCY OF AUTO-BOXING/UNBOXING AND DE-REFERENCING THAT ACTUALLY GOES ON BEHIND THE SCENES!
So why is Java slow? Hmm... disappointing.
And so... yesterday, we learned about templates and generics and so forth, and I was able to ask a question that's been bugging me for quite some time. "When you declare an ArrayList<Integer>, are Integer Objects being stored internally, or are primitive ints being stored?"
Prior to yesterday, I had given this quite a bit of thought, actually. I'd say about a year ago, I determined that since the classes like Integer, Byte, Float were final, non-inheritable classes, that Java could actually optimize the language by artificially storing primitive ints, auto-unboxing and auto-boxing them, as appropriate. But, as it turned out, I was wrong!
Java does not do that. Instead, when you store a bunch of "primitive ints" in an ArrayList<Integer>, what actually happens is that the ints are wrapped by Integer Objects, and the Integer Objects are essentially up-casted to Object internally and stored in the ArrayList. Then, when you grab something from the ArrayList, it is essentially down-casted to Integer again, and converted back to a primitive int, if necessary. This solution provides backward compatibility, but it is VERY inefficient.
Observe...
If I have a normal Java array of int of 10,000 elements...
int[] foo = new int[10000];
The memory allocated here is (10,000 elements) * (4 bytes per element) + 16 byte array overhead = 40016 bytes
The array has a length (integer), and you should also count the pointer to the array, etc.
If I have an ArrayList<Integer> of 10,000 Integer Objects...
ArrayList<Integer> foo = new ArrayList<Integer> (10000);
The memory allocated here is (8 bytes for Object) + (4 bytes for 32-bit pointer to actual array) + (4 bytes for "size" variable) + (16 bytes of array overhead) + (4 bytes from parent class instance variable -- another size) + (4 bytes of crap overhead -- don't ask) + (4 bytes per Object reference * 10000 elements) = 40040 bytes... all to store null Integer Object references. If you actually place integers into the ArrayList, then each Integer Object will take up 16 bytes -- (8 bytes for Object) + (4 bytes for int inside Object) + (4 bytes of crap overhead). So, if you actually fill that ArrayList<Integer> with data, you will use 200,040 bytes. That's roughly FIVE times more than what was really required!!
NOT TO MENTION THE ADDITIONAL INEFFICIENCY OF AUTO-BOXING/UNBOXING AND DE-REFERENCING THAT ACTUALLY GOES ON BEHIND THE SCENES!
So why is Java slow? Hmm... disappointing.
Comments