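First, note that BitSet is not part of the Collections Framework: it implements neither Collection nor Map. Because it bears some resemblance to List, however, it is examined here alongside the collection classes.
The BitSet class implements a bit vector that grows as needed. Each bit holds a boolean value, and the bits are indexed by non-negative integers. Any indexed bit can be examined, set, or cleared. By default, every bit is initially false. BitSet is not thread-safe, so use it from a single thread or guard it with external synchronization.
BitSet is best thought of as a collection of on/off flags. For a large volume of distinct non-negative integers, representing each value by the bit at its index, rather than storing the value itself, saves a great deal of space.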
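The idea of representing a value by its index can be sketched as follows. This is only an illustration: the class name BitSetMembershipDemo and the helper toBitSet are made up for this example, and only the standard java.util.BitSet methods set, get, and cardinality are used.

```java
import java.util.BitSet;

class BitSetMembershipDemo {
    // Marks each value by setting the bit at that index.
    // Duplicates collapse naturally because a bit is either set or not.
    static BitSet toBitSet(int[] values) {
        BitSet seen = new BitSet();
        for (int v : values) {
            seen.set(v); // throws IndexOutOfBoundsException for negative values
        }
        return seen;
    }

    public static void main(String[] args) {
        BitSet seen = toBitSet(new int[] {3, 70, 3, 1000, 70});
        System.out.println(seen.get(70));       // true
        System.out.println(seen.get(71));       // false
        System.out.println(seen.cardinality()); // 3
    }
}
```

Each distinct value costs one bit, so the memory used depends on the largest value stored rather than on how many values there are.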
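The size of a BitSet is not necessarily the size that was requested. The value returned by size() is always a multiple of 64, which follows from how the backing storage is allocated. Suppose a BitSet is instantiated with the following code: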
BitSet set = new BitSet(129);
/**
 * Creates a bit set whose initial size is large enough to explicitly
 * represent bits with indices in the range <code>0</code> through
 * <code>nbits-1</code>. All bits are initially <code>false</code>.
 *
 * @param nbits the initial size of the bit set.
 * @exception NegativeArraySizeException if the specified initial size
 *            is negative.
 */
public BitSet(int nbits) {
    // nbits can't be negative; size 0 is OK
    if (nbits < 0)
        throw new NegativeArraySizeException("nbits < 0: " + nbits);

    initWords(nbits);
    sizeIsSticky = true;
}

private void initWords(int nbits) {
    words = new long[wordIndex(nbits-1) + 1];
}

/**
 * Given a bit index, return word index containing it.
 */
private static int wordIndex(int bitIndex) {
    return bitIndex >> ADDRESS_BITS_PER_WORD;
}

private final static int ADDRESS_BITS_PER_WORD = 6;
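The allocation arithmetic above can be checked with a small sketch. The helper wordsFor is hypothetical and merely repeats the calculation done in initWords; the sizes in the comments assume the OpenJDK implementation quoted here, since the BitSet API documentation describes size() as implementation-dependent.

```java
import java.util.BitSet;

class BitSetSizeDemo {
    // Same arithmetic as initWords: number of 64-bit words needed for nbits bits.
    static int wordsFor(int nbits) {
        return ((nbits - 1) >> 6) + 1;
    }

    public static void main(String[] args) {
        System.out.println(wordsFor(129));          // 3, since (128 >> 6) + 1 = 3
        System.out.println(new BitSet(129).size()); // 192 = 3 * 64
        System.out.println(new BitSet(64).size());  // 64
        System.out.println(wordsFor(0));            // 0, since (-1 >> 6) + 1 = 0
    }
}
```

So a request for 129 bits is rounded up to three long words, which is why size() reports 192 rather than 129.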
/**
 * Sets the bit at the specified index to <code>true</code>.
 *
 * @param bitIndex a bit index.
 * @exception IndexOutOfBoundsException if the specified index is negative.
 * @since JDK1.0
 */
public void set(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    int wordIndex = wordIndex(bitIndex);
    expandTo(wordIndex);

    words[wordIndex] |= (1L << bitIndex); // Restores invariants

    checkInvariants();
}
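The key statement is words[wordIndex] |= (1L << bitIndex). For a long operand, Java uses only the low six bits of the shift distance, so 1L << bitIndex is the same as 1L << (bitIndex % 64) for a non-negative bitIndex, and the bit lands at the correct position inside its word.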
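To see where a bit ends up, the following sketch sets bit 67 and inspects the backing words through the public toLongArray() method. The class name ShiftMaskDemo and the helper maskFor are made up for illustration; maskFor simply repeats the shift expression used in set.

```java
import java.util.Arrays;
import java.util.BitSet;

class ShiftMaskDemo {
    // Mask for bitIndex inside its 64-bit word, computed as in BitSet.set.
    // For a long, only the low six bits of the shift distance are used.
    static long maskFor(int bitIndex) {
        return 1L << bitIndex;
    }

    public static void main(String[] args) {
        System.out.println(67 >> 6);     // 1: bit 67 lives in the second word
        System.out.println(maskFor(67)); // 8: same as 1L << 3, because 67 % 64 = 3

        BitSet bits = new BitSet();
        bits.set(67);
        System.out.println(Arrays.toString(bits.toLongArray())); // [0, 8]
    }
}
```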
The counterpart of set is clear; the two are largely alike:
/**
 * Sets the bit specified by the index to <code>false</code>.
 *
 * @param bitIndex the index of the bit to be cleared.
 * @exception IndexOutOfBoundsException if the specified index is negative.
 * @since JDK1.0
 */
public void clear(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    int wordIndex = wordIndex(bitIndex);
    if (wordIndex >= wordsInUse)
        return;

    words[wordIndex] &= ~(1L << bitIndex);

    recalculateWordsInUse();
    checkInvariants();
}

/**
 * Ensures that the BitSet can accommodate a given wordIndex,
 * temporarily violating the invariants. The caller must
 * restore the invariants before returning to the user,
 * possibly using recalculateWordsInUse().
 * @param wordIndex the index to be accommodated.
 */
private void expandTo(int wordIndex) {
    int wordsRequired = wordIndex+1;
    if (wordsInUse < wordsRequired) {
        ensureCapacity(wordsRequired);
        wordsInUse = wordsRequired;
    }
}
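wordsInUse is the logical size of the long array words, that is, the number of words actually in use, which may be smaller than words.length. When a wordIndex is passed in, expandTo first compares this logical size with wordIndex + 1; if the logical size is smaller, it calls ensureCapacity: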
private void ensureCapacity(int wordsRequired) {
    if (words.length < wordsRequired) {
        // Allocate larger of doubled size or required size
        int request = Math.max(2 * words.length, wordsRequired);
        words = Arrays.copyOf(words, request);
        sizeIsSticky = false;
    }
}
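ensureCapacity reallocates only when the array is too short. The new length is twice the current length or, if that is still not enough, exactly the required length (much like Vector), and the existing words are copied into the new array. Back in expandTo, wordsInUse is then set to wordsRequired.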
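The growth policy can be observed from outside through size(). In the sketch below, grownLength is a hypothetical helper that mirrors the calculation in ensureCapacity; the sizes in the comments assume the OpenJDK code quoted above and may differ on other implementations.

```java
import java.util.BitSet;

class BitSetGrowthDemo {
    // Mirrors ensureCapacity: keep the array if it is long enough, otherwise
    // grow to the larger of double the current length and the required length.
    static int grownLength(int currentLength, int wordsRequired) {
        return currentLength < wordsRequired
                ? Math.max(2 * currentLength, wordsRequired)
                : currentLength;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(64);    // one word
        System.out.println(bits.size()); // 64

        bits.set(64);                    // needs 2 words: max(2 * 1, 2) = 2
        System.out.println(bits.size()); // 128

        bits.set(130);                   // needs 3 words: max(2 * 2, 3) = 4
        System.out.println(bits.size()); // 256

        bits.set(1000);                  // needs 16 words: max(2 * 4, 16) = 16
        System.out.println(bits.size()); // 1024
    }
}
```

Doubling keeps the number of reallocations small when bits are set in increasing order, while the fallback to the required length handles a single large jump.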
By contrast, clear can never trigger a reallocation, but it still has to check whether the index lies beyond the words in use:
if (wordIndex >= wordsInUse) return;
/**
 * Returns the value of the bit with the specified index. The value
 * is <code>true</code> if the bit with the index <code>bitIndex</code>
 * is currently set in this <code>BitSet</code>; otherwise, the result
 * is <code>false</code>.
 *
 * @param bitIndex the bit index.
 * @return the value of the bit with the specified index.
 * @exception IndexOutOfBoundsException if the specified index is negative.
 */
public boolean get(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    checkInvariants();

    int wordIndex = wordIndex(bitIndex);
    return (wordIndex < wordsInUse)
        && ((words[wordIndex] & (1L << bitIndex)) != 0);
}
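Both clear and get compare the word index against wordsInUse, so an index far beyond the allocated words is treated as an unset bit and never grows the array. A minimal sketch of that behaviour follows; the class and method names are invented for this example, and the unchanged size() assumes the OpenJDK code shown above.

```java
import java.util.BitSet;

class BitSetOutOfRangeDemo {
    // Reads and clears an index beyond the highest set bit, then reports
    // whether the bit read as false and size() stayed the same.
    static boolean readAndClearLeaveSizeAlone(BitSet bits, int farIndex) {
        int before = bits.size();
        boolean value = bits.get(farIndex); // false: that word is not in use
        bits.clear(farIndex);               // returns early for the same reason
        return !value && bits.size() == before;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(64);
        bits.set(5);
        System.out.println(readAndClearLeaveSizeAlone(bits, 1_000_000)); // true
        System.out.println(bits.get(5));    // true: existing bits are untouched
        bits.clear(5);
        System.out.println(bits.isEmpty()); // true
    }
}
```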
3) The size method:
/**
 * Returns the number of bits of space actually in use by this
 * <code>BitSet</code> to represent bit values.
 * The maximum element in the set is the size - 1st element.
 *
 * @return the number of bits currently in this bit set.
 */
public int size() {
    return words.length * BITS_PER_WORD;
}
This also relies on a constant, defined as follows:
private final static int ADDRESS_BITS_PER_WORD = 6;
private final static int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;

/**
 * Returns the "logical size" of this <code>BitSet</code>: the index of
 * the highest set bit in the <code>BitSet</code> plus one. Returns zero
 * if the <code>BitSet</code> contains no set bits.
 *
 * @return the logical size of this <code>BitSet</code>.
 * @since 1.2
 */
public int length() {
    if (wordsInUse == 0)
        return 0;

    return BITS_PER_WORD * (wordsInUse - 1) +
        (BITS_PER_WORD - Long.numberOfLeadingZeros(words[wordsInUse - 1]));
}
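The length method returns the logical size of the BitSet. For example, take a BitSet declared with 129 bits (192 bits are actually allocated) in which bits 23, 45, and 67 are set to true: its logical size is 68, because indices start at 0. Long.numberOfLeadingZeros returns the number of zero bits to the left of the highest one bit in the binary representation. Since the left end of a long is its high-order end, those zeros correspond to the later bits of the BitSet; in other words, they are the bits that follow the last 1 in the last long in use. Subtracting that count from 64 and adding 64 times the number of preceding longs gives the logical size. (Note that bit 0 of the BitSet is stored in the lowest-order bit of the first long, and the remaining bits follow in the same way.)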
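The example with bits 23, 45, and 67 can be reproduced directly. In the sketch below, logicalLength is a hypothetical helper that applies the same formula to the array returned by toLongArray(), which contains only the words in use; the size of 192 assumes the OpenJDK allocation described earlier.

```java
import java.util.BitSet;

class BitSetLengthDemo {
    // Same formula as length(): 64 bits for every word before the last one,
    // plus the position just past the highest set bit in the last word.
    static int logicalLength(long[] wordsInUse) {
        if (wordsInUse.length == 0) {
            return 0;
        }
        int last = wordsInUse.length - 1;
        return 64 * last + (64 - Long.numberOfLeadingZeros(wordsInUse[last]));
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(129);
        bits.set(23);
        bits.set(45);
        bits.set(67);

        System.out.println(bits.size());   // 192: three allocated words
        System.out.println(bits.length()); // 68: highest set bit is 67
        // Bit 67 is bit 3 of the second word, so that word is 8 and has
        // 60 leading zeros: 64 * 1 + (64 - 60) = 68.
        System.out.println(logicalLength(bits.toLongArray())); // 68
    }
}
```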
Reference: http://blog.csdn.net/wxwzy738/article/details/8879423