colossus_bigdata

Spark Source Code Study (10) --- BlockManager Analysis

How BlockManager works:

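BlockManager lives in org.apache.spark.storage and relies on four important components: DiskStore, MemoryStore, BlockTransferService, and ConnectionManager. DiskStore reads and writes data on disk; MemoryStore reads and writes data in memory; ConnectionManager manages connections to remote nodes; and BlockTransferService (BlockManagerWorker in earlier versions) reads and writes data on remote nodes. When a BlockManager is created and started, it registers with BlockManagerMaster. BlockManagerMaster runs on the driver and manages block metadata, such as BlockManagerInfo and BlockStatus. Whenever a BlockManager adds, removes, or updates a block, it notifies BlockManagerMaster, which then updates the metadata through the BlockStatus held in the corresponding BlockManagerInfo.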

Start with BlockManagerMaster in org.apache.spark.storage. Its important functionality is defined in the BlockManagerMasterActor class, which is analyzed below:

First, it holds a HashMap named blockManagerInfo that maps each BlockManagerId to its BlockManagerInfo; BlockManagerInfo records metadata about a BlockManager:

private val blockManagerInfo = new mutable.HashMap[BlockManagerId, BlockManagerInfo]

Another important member maps each executor to its BlockManager:

private val blockManagerIdByExecutor = new mutable.HashMap[String, BlockManagerId]

Now look at how a BlockManager is registered:

  private def register(id: BlockManagerId, maxMemSize: Long, slaveActor: ActorRef) {
    val time = System.currentTimeMillis()
    // Register the BlockManager only if it has not been registered yet
    if (!blockManagerInfo.contains(id)) {
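      // BlockManagerId carries an executorId field, so look up the executorId through it.
      // If that executorId is already mapped, remove the BlockManagerId it currently maps to.
      // We are inside the !blockManagerInfo.contains(id) branch, so any BlockManagerId still
      // mapped to this executorId is a different, stale one.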
      blockManagerIdByExecutor.get(id.executorId) match {
        case Some(oldId) =>
          // A block manager of the same executor already exists, so remove it (assumed dead)
          logError("Got two different block manager registrations on same executor - " 
              + s" will replace old one $oldId with new one $id")
          removeExecutor(id.executorId)  
        case None =>
      }
      logInfo("Registering block manager %s with %s RAM, %s".format(
        id.hostPort, Utils.bytesToString(maxMemSize), id))
      // Map the executorId (key) to the new BlockManagerId (value)
      blockManagerIdByExecutor(id.executorId) = id
      // Create the BlockManagerInfo and map the BlockManagerId to it
      blockManagerInfo(id) = new BlockManagerInfo(
        id, System.currentTimeMillis(), maxMemSize, slaveActor)
    }
    listenerBus.post(SparkListenerBlockManagerAdded(time, id, maxMemSize))
  }
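The registration bookkeeping can be sketched in isolation. This is a minimal illustration, not Spark's code: `NodeId`, `NodeInfo`, and `RegistrySketch` are hypothetical stand-ins for `BlockManagerId`, `BlockManagerInfo`, and the actor, and the listener bus and logging are left out.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for BlockManagerId and BlockManagerInfo.
final case class NodeId(executorId: String, host: String, port: Int)
final case class NodeInfo(id: NodeId, registeredAtMs: Long, maxMemSize: Long)

class RegistrySketch {
  private val infoById = mutable.HashMap.empty[NodeId, NodeInfo]
  private val idByExecutor = mutable.HashMap.empty[String, NodeId]

  /** Registers `id`; returns the stale id that was replaced, if any. */
  def register(id: NodeId, maxMemSize: Long): Option[NodeId] = {
    if (infoById.contains(id)) {
      None // already registered: nothing to do
    } else {
      // A different block manager on the same executor is assumed dead.
      val stale = idByExecutor.get(id.executorId)
      stale.foreach(old => infoById.remove(old))
      idByExecutor(id.executorId) = id
      infoById(id) = NodeInfo(id, System.currentTimeMillis(), maxMemSize)
      stale
    }
  }

  def registered: Set[NodeId] = infoById.keySet.toSet
}
```

Registering a second id for the same executor replaces the first one, which mirrors the `removeExecutor` call in the excerpt above.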

Updating block info: whenever a block changes on any BlockManager, updateBlockInfo is called to update the master's record of that block:

  private def updateBlockInfo(
      blockManagerId: BlockManagerId,
      blockId: BlockId,
      storageLevel: StorageLevel,
      memSize: Long,
      diskSize: Long,
      tachyonSize: Long): Boolean = {

    if (!blockManagerInfo.contains(blockManagerId)) {
      if (blockManagerId.isDriver && !isLocal) {
        // We intentionally do not register the master (except in local mode),
        // so we should not indicate failure.
        return true
      } else {
        return false
      }
    }

    if (blockId == null) {
      blockManagerInfo(blockManagerId).updateLastSeenMs()
      return true
    }

    blockManagerInfo(blockManagerId).updateBlockInfo(
      blockId, storageLevel, memSize, diskSize, tachyonSize)
    var locations: mutable.HashSet[BlockManagerId] = null
    if (blockLocations.containsKey(blockId)) {
      locations = blockLocations.get(blockId)
    } else {
      locations = new mutable.HashSet[BlockManagerId]
      blockLocations.put(blockId, locations)
    }

    if (storageLevel.isValid) {
      locations.add(blockManagerId)
    } else {
      locations.remove(blockManagerId)
    }

    // Remove the block from master tracking if it has been removed on all slaves.
    if (locations.size == 0) {
      blockLocations.remove(blockId)
    }
    true
  }
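The location-tracking part of updateBlockInfo boils down to maintaining a map from block to the set of managers that hold it. The sketch below assumes plain `String` ids in place of `BlockId` and `BlockManagerId`, and a boolean in place of `storageLevel.isValid`; `LocationTrackerSketch` is a hypothetical name.

```scala
import scala.collection.mutable

class LocationTrackerSketch {
  // blockId -> ids of the block managers that currently hold the block
  private val locations = mutable.HashMap.empty[String, mutable.HashSet[String]]

  def update(blockId: String, managerId: String, levelIsValid: Boolean): Unit = {
    val holders = locations.getOrElseUpdate(blockId, mutable.HashSet.empty[String])
    // A valid storage level means the manager holds the block; otherwise it dropped it.
    if (levelIsValid) holders += managerId else holders -= managerId
    // Stop tracking a block once no manager holds it any more.
    if (holders.isEmpty) locations.remove(blockId)
  }

  def holdersOf(blockId: String): Set[String] =
    locations.get(blockId).map(_.toSet).getOrElse(Set.empty[String])

  def isTracked(blockId: String): Boolean = locations.contains(blockId)
}
```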

Next comes the BlockManager class itself, starting with its definition:

private[spark] class BlockManager(
    executorId: String,
    actorSystem: ActorSystem,
    val master: BlockManagerMaster,
    defaultSerializer: Serializer,
    maxMemory: Long,
    val conf: SparkConf,
    mapOutputTracker: MapOutputTracker,
    shuffleManager: ShuffleManager,
    blockTransferService: BlockTransferService,
    securityManager: SecurityManager,
    numUsableCores: Int)
  extends BlockDataManager with Logging

BlockManager manages several kinds of storage: memory, disk, and Tachyon. Each has a corresponding class that operates on the data: MemoryStore, DiskStore, and TachyonStore.

  private[spark] val memoryStore = new MemoryStore(this, maxMemory)
  private[spark] val diskStore = new DiskStore(this, diskBlockManager)
  private[spark] lazy val tachyonStore: TachyonStore = {
    val storeDir = conf.get("spark.tachyonStore.baseDir", "/tmp_spark_tachyon")
    val appFolderName = conf.get("spark.tachyonStore.folderName")
    val tachyonStorePath = s"$storeDir/$appFolderName/${this.executorId}"
    val tachyonMaster = conf.get("spark.tachyonStore.url",  "tachyon://localhost:19998")
    val tachyonBlockManager =
      new TachyonBlockManager(this, tachyonStorePath, tachyonMaster)
    tachyonInitialized = true
    new TachyonStore(this, tachyonBlockManager)
  }

When a BlockManager is initialized, its initialize method is called:

  def initialize(appId: String): Unit = {
    blockTransferService.init(this)
    shuffleClient.init(appId)
    // A BlockManager is identified by its executorId plus the blockTransferService host and port
    blockManagerId = BlockManagerId(
      executorId, blockTransferService.hostName, blockTransferService.port)

    shuffleServerId = if (externalShuffleServiceEnabled) {
      BlockManagerId(executorId, blockTransferService.hostName, externalShuffleServicePort)
    } else {
      blockManagerId
    }
    // Register this BlockManager with BlockManagerMaster
    master.registerBlockManager(blockManagerId, maxMemory, slaveActor)

    // Register Executors' configuration with the local shuffle service, if one should exist.
    if (externalShuffleServiceEnabled && !blockManagerId.isDriver) {
      registerWithExternalShuffleServer()
    }
  }

BlockManager reads local data through the doGetLocal method.

First, the case where the data is read from the memory store:

  private def doGetLocal(blockId: BlockId, asBlockResult: Boolean): Option[Any] = {
    // orNull is an Option method: it returns the value if present, or null if the Option is empty.
    // blockInfo is a TimeStampedHashMap[BlockId, BlockInfo]
    val info = blockInfo.get(blockId).orNull
    if (info != null) {
      info.synchronized {
        // Double check to make sure the block is still there. There is a small chance that the
        // block has been removed by removeBlock (which also synchronizes on the blockInfo object).
        // Note that this only checks metadata tracking. If user intentionally deleted the block
        // on disk or from off heap storage without using removeBlock, this conditional check will
        // still pass but eventually we will get an exception because we can't find the block.
        // Check whether the entry is still in blockInfo, which holds each block's metadata.
        // It is gone if the block was removed in the meantime, e.g. by an explicit removeBlock call.
        if (blockInfo.get(blockId).isEmpty) {
          logWarning(s"Block $blockId had been removed")
          return None
        }

        // If another thread is writing the block, wait for it to become ready.
        // If another thread is still working on this block, wait
        if (!info.waitForReady()) {
          // If we get here, the block write failed.
          logWarning(s"Block $blockId was marked as failure.")
          return None
        }
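        // Get the storage level: memory, disk, or Tachyon, whether data spills to disk when
        // memory or Tachyon fills up, and how many replicas are required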
        val level = info.level
        logDebug(s"Level for block $blockId is $level")

        // Look for the block in memory
        // The block is stored in memory:
        // read it with memoryStore.getValues or memoryStore.getBytes
        if (level.useMemory) {
          logDebug(s"Getting block $blockId from memory")
          val result = if (asBlockResult) {
            // The caller wants deserialized values
            memoryStore.getValues(blockId).map(new BlockResult(_, DataReadMethod.Memory, info.size))
          } else {
            // The caller wants serialized bytes
            memoryStore.getBytes(blockId)
          }
          result match {
            case Some(values) =>
              return result
            case None =>
              logDebug(s"Block $blockId not found in memory")
          }
        }
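The `waitForReady` call above blocks a reader until the writer marks the block ready or failed. A minimal sketch of that handshake, assuming a plain monitor with `wait`/`notifyAll`; `ReadySketch` is a hypothetical class, not Spark's `BlockInfo`, and it omits the block size that `markReady` records there:

```scala
class ReadySketch {
  // None = write still pending, Some(true) = ready, Some(false) = write failed
  private var state: Option[Boolean] = None

  def markReady(): Unit = synchronized { state = Some(true); notifyAll() }

  def markFailure(): Unit = synchronized { state = Some(false); notifyAll() }

  /** Blocks until the writer finishes; returns true if the block became ready. */
  def waitForReady(): Boolean = synchronized {
    while (state.isEmpty) wait() // loop guards against spurious wake-ups
    state.get
  }
}
```

A reader that calls `waitForReady()` before the writer calls `markReady()` simply parks until the notification arrives; a reader that arrives later returns immediately.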

Depending on whether the caller wants deserialized values or serialized bytes, getValues or getBytes is called. getValues returns deserialized data:

  override def getValues(blockId: BlockId): Option[Iterator[Any]] = {
    val entry = entries.synchronized {
      entries.get(blockId)
    }
    if (entry == null) {
      None
    } else if (entry.deserialized) {
      // Stored as deserialized objects: return them directly
      Some(entry.value.asInstanceOf[Array[Any]].iterator)
    } else {
      // Stored as serialized bytes: deserialize before returning
      val buffer = entry.value.asInstanceOf[ByteBuffer].duplicate() // Doesn't actually copy data
      Some(blockManager.dataDeserialize(blockId, buffer))
    }
  }

getBytes returns serialized data:

  override def getBytes(blockId: BlockId): Option[ByteBuffer] = {
    val entry = entries.synchronized {
      // Look the entry up in memory
      entries.get(blockId)
    }
    if (entry == null) {
      None
    } else if (entry.deserialized) { // Stored as objects: serialize first; otherwise return the bytes as-is
      Some(blockManager.dataSerialize(blockId, entry.value.asInstanceOf[Array[Any]].iterator))
    } else {
      Some(entry.value.asInstanceOf[ByteBuffer].duplicate()) // Doesn't actually copy the data
    }
  }
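The symmetry between getValues and getBytes can be shown with a toy entry that holds either objects or bytes. This sketch assumes a made-up codec (one string per line, UTF-8) where Spark uses its pluggable serializer; `EntrySketch` and its members are hypothetical names.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object EntrySketch {
  // Hypothetical codec: one string per line, UTF-8 encoded.
  def serialize(values: Iterator[String]): ByteBuffer =
    ByteBuffer.wrap(values.mkString("\n").getBytes(UTF_8))

  def deserialize(bytes: ByteBuffer): Iterator[String] = {
    val copy = new Array[Byte](bytes.remaining())
    bytes.get(copy) // advances the position of `bytes`
    new String(copy, UTF_8).split("\n").iterator
  }

  // Left = stored deserialized, Right = stored serialized
  final case class Entry(value: Either[Array[String], ByteBuffer])

  def getValues(e: Entry): Iterator[String] = e.value match {
    case Left(values) => values.iterator
    // duplicate() shares the content but has its own position, so the stored
    // buffer is still readable after this call.
    case Right(bytes) => deserialize(bytes.duplicate())
  }

  def getBytes(e: Entry): ByteBuffer = e.value match {
    case Left(values) => serialize(values.iterator)
    case Right(bytes) => bytes.duplicate()
  }
}
```

The `duplicate()` calls are the reason repeated reads of the same serialized entry keep working: each reader consumes its own view.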

Next comes reading from disk, which has two cases: the block uses disk only, or the block uses both disk and memory:

        if (level.useDisk) {
          logDebug(s"Getting block $blockId from disk")
          val bytes: ByteBuffer = diskStore.getBytes(blockId) match {
            case Some(b) => b
            case None =>
              throw new BlockException(
                blockId, s"Block $blockId not found on disk, though it should be")
          }
          assert(0 == bytes.position())
          // The block uses disk only, not memory
          if (!level.useMemory) {
            // If the block shouldn't be stored in memory, we can just return it
            if (asBlockResult) {
              return Some(new BlockResult(dataDeserialize(blockId, bytes), DataReadMethod.Disk,
                info.size))
            } else {
              return Some(bytes)
            }
            // The block uses both disk and memory
          } else {
            // Otherwise, we also have to store something in the memory store
            if (!level.deserialized || !asBlockResult) {
              /* We'll store the bytes in memory if the block's storage level includes
               * "memory serialized", or if it should be cached as objects in memory
               * but we only requested its serialized bytes. */
              val copyForMemory = ByteBuffer.allocate(bytes.limit)
              copyForMemory.put(bytes)
              memoryStore.putBytes(blockId, copyForMemory, level)
              bytes.rewind()
            }
            if (!asBlockResult) {
              return Some(bytes)
            } else {
              val values = dataDeserialize(blockId, bytes)
              if (level.deserialized) {
                // Cache the values before returning them
                val putResult = memoryStore.putIterator(
                  blockId, values, level, returnValues = true, allowPersistToDisk = false)
                // The put may or may not have succeeded, depending on whether there was enough
                // space to unroll the block. Either way, the put here should return an iterator.
                putResult.data match {
                  case Left(it) =>
                    return Some(new BlockResult(it, DataReadMethod.Disk, info.size))
                  case _ =>
                    // This only happens if we dropped the values back to disk (which is never)
                    throw new SparkException("Memory store did not return an iterator!")
                }
              } else {
                return Some(new BlockResult(values, DataReadMethod.Disk, info.size))
              }
            }
          }
        }
      }
    } else {
      logDebug(s"Block $blockId not registered locally")
    }
    None
  }
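Stripped of serialization details, the disk path above is a two-tier lookup: try memory, fall back to disk, and copy the block back into memory when its level includes memory. A minimal sketch under those assumptions, with hypothetical names and in-memory maps standing in for both stores:

```scala
import scala.collection.mutable

// `cacheOnRead` plays the role of level.useMemory for a block that was found on disk.
class TwoTierSketch(cacheOnRead: Boolean) {
  val memory = mutable.HashMap.empty[String, Vector[Byte]]
  val disk = mutable.HashMap.empty[String, Vector[Byte]]

  def get(blockId: String): Option[Vector[Byte]] =
    memory.get(blockId).orElse {
      disk.get(blockId).map { bytes =>
        // Promote the block so the next read is served from memory.
        if (cacheOnRead) memory(blockId) = bytes
        bytes
      }
    }
}
```

The real method additionally decides whether to cache bytes or deserialized values, and the memory store may refuse the put when space is short; the sketch ignores both.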

That covers reading local data. Data can also be read from remote nodes, which is handled in doGetRemote:

  private def doGetRemote(blockId: BlockId, asBlockResult: Boolean): Option[Any] = {
    // Precondition check: throws an exception if the condition does not hold
    require(blockId != null, "BlockId is null")
    // Shuffle the block's locations to spread the load
    val locations = Random.shuffle(master.getLocations(blockId))
    // Try each location in turn until one of them returns the data
    for (loc <- locations) {
      logDebug(s"Getting remote block $blockId from $loc")
      // Fetch the block from the remote node
      val data = blockTransferService.fetchBlockSync(
        loc.host, loc.port, loc.executorId, blockId.toString).nioByteBuffer()
      if (data != null) {
        if (asBlockResult) {
          // The fetched data is serialized; deserialize it when the caller wants values
          return Some(new BlockResult(
            dataDeserialize(blockId, data),
            DataReadMethod.Network,
            data.limit()))
        } else {
          return Some(data)
        }
      }
      logDebug(s"The value of block $blockId is null")
    }
    logDebug(s"Block $blockId not found")
    None
  }
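The remote read is a "shuffle, then first success wins" loop. The sketch below assumes the fetch is modeled as a function returning `Option` (the excerpt instead checks the returned buffer for null); `RemoteFetchSketch` and `fetchFirst` are hypothetical names.

```scala
import scala.util.Random

object RemoteFetchSketch {
  /** Tries the locations in random order and returns the first data found. */
  def fetchFirst[A](
      locations: Seq[String],
      fetch: String => Option[A],
      rng: Random = new Random): Option[A] = {
    val shuffled = rng.shuffle(locations)
    // The iterator is lazy, so no further location is contacted after the first hit.
    shuffled.iterator.map(fetch).collectFirst { case Some(data) => data }
  }
}
```

Shuffling matters when many tasks ask for the same block at once: without it, every reader would hit the first replica in the list.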

The above covers the two ways of reading data: locally and remotely. Next comes writing, which is handled by the doPut method:

  private def doPut(
      blockId: BlockId,
      data: BlockValues,
      level: StorageLevel,
      tellMaster: Boolean = true,
      effectiveStorageLevel: Option[StorageLevel] = None)
    : Seq[(BlockId, BlockStatus)] = {

    require(blockId != null, "BlockId is null")
    require(level != null && level.isValid, "StorageLevel is null or invalid")
    effectiveStorageLevel.foreach { level =>
      require(level != null && level.isValid, "Effective StorageLevel is null or invalid")
    }

    // Return value
    // BlockStatus wraps the following information about a block:
    /*
     *     storageLevel: StorageLevel,
     *     memSize: Long,
     *     diskSize: Long,
     *     tachyonSize: Long
     */
    val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]

    /* Remember the block's storage level so that we can correctly drop it to disk if it needs
     * to be dropped right after it got put into memory. Note, however, that other threads will
     * not be able to get() this block until we call markReady on its BlockInfo. */
    // Create a BlockInfo for the block about to be written and put it into the map
    val putBlockInfo = {
      val tinfo = new BlockInfo(level, tellMaster)
      // Do atomically !
      // If no info exists for this blockId yet, associate the blockId with the new BlockInfo
      val oldBlockOpt = blockInfo.putIfAbsent(blockId, tinfo)
      if (oldBlockOpt.isDefined) {
        if (oldBlockOpt.get.waitForReady()) {
          logWarning(s"Block $blockId already exists on this machine; not re-adding it")
          return updatedBlocks
        }
        // TODO: So the block info exists - but previous attempt to load it (?) failed.
        // What do we do now ? Retry on it ?
        oldBlockOpt.get
      } else {
        tinfo
      }
    }

    val startTimeMs = System.currentTimeMillis

    /* If we're storing values and we need to replicate the data, we'll want access to the values,
     * but because our put will read the whole iterator, there will be no values left. For the
     * case where the put serializes data, we'll remember the bytes, above; but for the case where
     * it doesn't, such as deserialized storage, let's rely on the put returning an Iterator. */
    var valuesAfterPut: Iterator[Any] = null

    // Ditto for the bytes after the put
    var bytesAfterPut: ByteBuffer = null

    // Size of the block in bytes
    var size = 0L

    // The level we actually use to put the block
    val putLevel = effectiveStorageLevel.getOrElse(level)

    // If we're storing bytes, then initiate the replication before storing them locally.
    // This is faster as data is already serialized and ready to send.
    val replicationFuture = data match {
      case b: ByteBufferValues if putLevel.replication > 1 =>
        // Duplicate doesn't copy the bytes, but just creates a wrapper
        val bufferView = b.buffer.duplicate()
        Future { replicate(blockId, bufferView, putLevel) }
      case _ => null
    }
    // Lock the BlockInfo to synchronize concurrent threads
    putBlockInfo.synchronized {
      logTrace("Put for block %s took %s to get into synchronized block"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))

      var marked = false
      try {
        // returnValues - Whether to return the values put
        // blockStore - The type of storage to put these values into
        // blockStore is the memory store, the Tachyon store, or the disk store
        val (returnValues, blockStore: BlockStore) = {
          // Use memory
          if (putLevel.useMemory) {
            // Put it in memory first, even if it also has useDisk set to true;
            // We will drop it to disk later if the memory store can't hold it.
            (true, memoryStore)
            // Use Tachyon
          } else if (putLevel.useOffHeap) {
            // Use tachyon for off-heap storage
            (false, tachyonStore)
            // Use disk
          } else if (putLevel.useDisk) {
            // Don't get back the bytes from put unless we replicate them
            (putLevel.replication > 1, diskStore)
          } else {
            // Otherwise, fail because no usable storage level was specified
            assert(putLevel == StorageLevel.NONE)
            throw new BlockException(
              blockId, s"Attempted to put block $blockId without specifying storage level!")
          }
        }

        // Actually put the values
        // Put the data into the chosen store according to its type; the put methods write the
        // data and return information such as the number of bytes written
        val result = data match {
          case IteratorValues(iterator) =>
            blockStore.putIterator(blockId, iterator, putLevel, returnValues)
          case ArrayValues(array) =>
            blockStore.putArray(blockId, array, putLevel, returnValues)
          case ByteBufferValues(bytes) =>
            bytes.rewind()
            blockStore.putBytes(blockId, bytes, putLevel)
        }
        size = result.size
        result.data match {
          case Left (newIterator) if putLevel.useMemory => valuesAfterPut = newIterator
          case Right (newBytes) => bytesAfterPut = newBytes
          case _ =>
        }

        // Keep track of which blocks are dropped from memory
        if (putLevel.useMemory) {
          result.droppedBlocks.foreach { updatedBlocks += _ }
        }
        // Get the current status of the block
        val putBlockStatus = getCurrentBlockStatus(blockId, putBlockInfo)
        if (putBlockStatus.storageLevel != StorageLevel.NONE) {
          // Now that the block is in either the memory, tachyon, or disk store,
          // let other threads read it, and tell the master about it.
          marked = true
          putBlockInfo.markReady(size)
          if (tellMaster) {
            // Report the block status to the master so that it updates its metadata
            reportBlockStatus(blockId, putBlockInfo, putBlockStatus)
          }
          updatedBlocks += ((blockId, putBlockStatus))
        }
      } finally {
        // If we failed in putting the block to memory/disk, notify other possible readers
        // that it has failed, and then remove it from the block info map.
        if (!marked) {
          // Note that the remove must happen before markFailure otherwise another thread
          // could've inserted a new BlockInfo before we remove it.
          blockInfo.remove(blockId)
          putBlockInfo.markFailure()
          logWarning(s"Putting block $blockId failed")
        }
      }
    }
    logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))

    // Either we're storing bytes and we asynchronously started replication, or we're storing
    // values and need to serialize and replicate them now:
    if (putLevel.replication > 1) { // More than one replica requested: copy the data to peers
      data match {
        case ByteBufferValues(bytes) =>
          if (replicationFuture != null) {
            Await.ready(replicationFuture, Duration.Inf)
          }
        case _ =>
          val remoteStartTime = System.currentTimeMillis
          // Serialize the block if not already done
          if (bytesAfterPut == null) {
            if (valuesAfterPut == null) {
              throw new SparkException(
                "Underlying put returned neither an Iterator nor bytes! This shouldn't happen.")
            }
            bytesAfterPut = dataSerialize(blockId, valuesAfterPut)
          }
          replicate(blockId, bytesAfterPut, putLevel) // Copies the data to other block managers
          logDebug("Put block %s remotely took %s"
            .format(blockId, Utils.getUsedTimeMs(remoteStartTime)))
      }
    }

    BlockManager.dispose(bytesAfterPut)
	
    if (putLevel.replication > 1) {
      logDebug("Putting block %s with replication took %s"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))
    } else {
      logDebug("Putting block %s without replication took %s"
        .format(blockId, Utils.getUsedTimeMs(startTimeMs)))
    }

    updatedBlocks
  }
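The store selection inside doPut follows a fixed precedence: memory first, then off-heap (Tachyon), then disk. A minimal sketch of that decision, assuming a simplified `Level` case class in place of `StorageLevel` and returning `None` where the excerpt throws a `BlockException`:

```scala
object StoreChoiceSketch {
  // Hypothetical, simplified stand-in for StorageLevel.
  final case class Level(
      useMemory: Boolean,
      useOffHeap: Boolean,
      useDisk: Boolean,
      replication: Int = 1)

  sealed trait Store
  case object Memory extends Store
  case object OffHeap extends Store
  case object Disk extends Store

  /** Returns (returnValues, store), or None when the level names no store at all. */
  def choose(level: Level): Option[(Boolean, Store)] =
    if (level.useMemory) Some((true, Memory)) // memory wins even if useDisk is also set
    else if (level.useOffHeap) Some((false, OffHeap))
    else if (level.useDisk) Some((level.replication > 1, Disk)) // bytes needed only to replicate
    else None
}
```

Note that a memory-and-disk level still goes to the memory store first; the block reaches disk only if memory cannot hold it.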

Within doPut, the actual write is done by the following code:

        val result = data match {
          case IteratorValues(iterator) =>
            blockStore.putIterator(blockId, iterator, putLevel, returnValues)
          case ArrayValues(array) =>
            blockStore.putArray(blockId, array, putLevel, returnValues)
          case ByteBufferValues(bytes) =>
            bytes.rewind()
            blockStore.putBytes(blockId, bytes, putLevel)
        }

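This code dispatches to the chosen blockStore, which is one of three kinds depending on the storage level. For memoryStore, the write calls memoryStore's putIterator method, which eventually ends up in tryToPut: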

  private def tryToPut(
      blockId: BlockId,
      value: Any,
      size: Long,
      deserialized: Boolean): ResultWithDroppedBlocks = {

    /* TODO: Its possible to optimize the locking by locking entries only when selecting blocks
     * to be dropped. Once the to-be-dropped blocks have been selected, and lock on entries has
     * been released, it must be ensured that those to-be-dropped blocks are not double counted
     * for freeing up more space for another block that needs to be put. Only then the actually
     * dropping of blocks (and writing to disk if necessary) can proceed in parallel. */

    var putSuccess = false
    val droppedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]
    // Synchronize concurrent puts while checking the memory size
    accountingLock.synchronized {
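      // Make sure there is enough free space. If current memory cannot hold the new block,
      // ensureFreeSpace synchronizes on entries and evicts existing blocks. Evicted blocks whose
      // storage level allows disk are written to disk; those whose level does not are lost.
      // entries is a LinkedHashMap, so the oldest entries are evicted first.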
      val freeSpaceResult = ensureFreeSpace(blockId, size)
      val enoughFreeSpace = freeSpaceResult.success
      droppedBlocks ++= freeSpaceResult.droppedBlocks
      // Check the result of ensureFreeSpace to see whether there is enough memory
      if (enoughFreeSpace) {
        // The data actually stored is wrapped in a MemoryEntry
        val entry = new MemoryEntry(value, size, deserialized)
        entries.synchronized {
          // Put the new entry into entries, keyed by its blockId
          entries.put(blockId, entry)
          currentMemory += size
        }
        val valuesOrBytes = if (deserialized) "values" else "bytes"
        logInfo("Block %s stored as %s in memory (estimated size %s, free %s)".format(
          blockId, valuesOrBytes, Utils.bytesToString(size), Utils.bytesToString(freeMemory)))
        putSuccess = true
      } else {
        // The block does not fit even after evicting other blocks, so drop it to disk
        // Tell the block manager that we couldn't put it in memory so that it can drop it to
        // disk if the block allows disk storage.
        val data = if (deserialized) {
          Left(value.asInstanceOf[Array[Any]])
        } else {
          Right(value.asInstanceOf[ByteBuffer].duplicate())
        }
        val droppedBlockStatus = blockManager.dropFromMemory(blockId, data)
        droppedBlockStatus.foreach { status => droppedBlocks += ((blockId, status)) }
      }
    }
    ResultWithDroppedBlocks(putSuccess, droppedBlocks)
  }
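The eviction order described in the comments can be reproduced with a `java.util.LinkedHashMap`. The sketch below assumes the map is created in access order (so the least recently used entry comes first) and that evicted blocks are simply discarded; the real ensureFreeSpace has further rules that this hypothetical `LruMemorySketch` leaves out.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}
import scala.collection.mutable.ArrayBuffer

class LruMemorySketch(maxBytes: Long) {
  // accessOrder = true: iteration runs from least to most recently used.
  private val entries = new JLinkedHashMap[String, Array[Byte]](32, 0.75f, true)
  private var used = 0L

  /** Tries to store the block; returns (success, ids of the blocks evicted to make room). */
  def tryToPut(blockId: String, data: Array[Byte]): (Boolean, List[String]) = synchronized {
    val evicted = ArrayBuffer.empty[String]
    if (data.length > maxBytes) {
      (false, evicted.toList) // can never fit, so do not evict anything
    } else {
      // Replacing an existing block must not count its size twice.
      Option(entries.remove(blockId)).foreach(old => used -= old.length)
      val it = entries.entrySet().iterator()
      while (used + data.length > maxBytes && it.hasNext) {
        val e = it.next()
        used -= e.getValue.length
        evicted += e.getKey
        it.remove()
      }
      entries.put(blockId, data)
      used += data.length
      (true, evicted.toList)
    }
  }

  // get() reorders an access-ordered map, so it needs the same lock as tryToPut.
  def get(blockId: String): Option[Array[Byte]] = synchronized(Option(entries.get(blockId)))
}
```

With a 10-byte budget and two 4-byte blocks `a` and `b`, reading `a` and then putting a third 4-byte block evicts `b`, because `b` is now the least recently used entry.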

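For diskStore, the data is written straight to disk with Java I/O streams.

Replication to multiple copies is implemented by the following loop in replicate: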

    while (!done) {
      getRandomPeer() match {
        case Some(peer) =>
          try {
            val onePeerStartTime = System.currentTimeMillis
            data.rewind()
            logTrace(s"Trying to replicate $blockId of ${data.limit()} bytes to $peer")
            // Upload the data to another BlockManager; uploadBlockSync blocks until it completes
            blockTransferService.uploadBlockSync(
              peer.host, peer.port, peer.executorId, blockId, new NioManagedBuffer(data), tLevel)
            logTrace(s"Replicated $blockId of ${data.limit()} bytes to $peer in %s ms"
              .format(System.currentTimeMillis - onePeerStartTime))
            peersReplicatedTo += peer
            peersForReplication -= peer
            replicationFailed = false
            if (peersReplicatedTo.size == numPeersToReplicateTo) {
              done = true  // specified number of peers have been replicated to
            }
          } catch {
            case e: Exception =>
              logWarning(s"Failed to replicate $blockId to $peer, failure #$failures", e)
              failures += 1
              replicationFailed = true
              peersFailedToReplicateTo += peer
              if (failures > maxReplicationFailures) { // too many failures in replicating to peers
                done = true
              }
          }
        case None => // no peer left to replicate to
          done = true
      }
    }
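The control flow of this loop can be sketched on its own. The version below makes two assumptions that the excerpt does not show in full: a peer is never tried twice, whether it succeeded or failed, and an upload failure is signalled by an exception. `ReplicationSketch` and its parameters are hypothetical.

```scala
import scala.collection.mutable
import scala.util.{Failure, Random, Success, Try}

object ReplicationSketch {
  /** Uploads to up to `wanted` distinct peers; returns the peers that accepted the block. */
  def replicate(
      peers: Seq[String],
      wanted: Int,
      maxFailures: Int,
      upload: String => Unit,
      rng: Random = new Random): Set[String] = {
    val candidates = mutable.ArrayBuffer.empty[String]
    candidates ++= peers.distinct
    val replicatedTo = mutable.Set.empty[String]
    var failures = 0
    // Stop when enough copies exist, too many uploads failed, or no peer is left.
    while (replicatedTo.size < wanted && failures <= maxFailures && candidates.nonEmpty) {
      val peer = candidates.remove(rng.nextInt(candidates.size)) // pick and drop a random peer
      Try(upload(peer)) match {
        case Success(_) => replicatedTo += peer
        case Failure(_) => failures += 1
      }
    }
    replicatedTo.toSet
  }
}
```

As in the excerpt, the loop gives up once the failure count exceeds the limit, so the caller may end up with fewer replicas than requested.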

你可能感兴趣的:(spark,spark源码分析)

实时数据流计算引擎Flink和Spark剖析程小舰 flink spark 数据库 kafka hadoop
在过去几年，业界的主流流计算引擎大多采用SparkStreaming，随着近两年Flink的快速发展，Flink的使用也越来越广泛。与此同时，Spark针对SparkStreaming的不足，也继而推出了新的流计算组件。本文旨在深入分析不同的流计算引擎的内在机制和功能特点，为流处理场景的选型提供参考。（DLab数据实验室w.x.公众号出品）一.SparkStreamingSparkStreamin
Spark SQL架构及高级用法 Aurora_NeAr spark sql 架构
SparkSQL架构概述架构核心组件API层（用户接口）输入方式：SQL查询；DataFrame/DatasetAPI。统一性：所有接口最终转换为逻辑计划树（LogicalPlan），进入优化流程。编译器层（Catalyst优化器）核心引擎：基于规则的优化器（Rule-BasedOptimizer,RBO）与成本优化器（Cost-BasedOptimizer,CBO）。处理流程：阶段输入输出关键动
Hive详解
一：Hive的历史价值1，Hive是Hadoop上的KillerApplication，Hive是Hadoop上的数据仓库，Hive同时兼具有数据仓库中的存储引擎和查询引擎的作用；而SparkSQL是一个更加出色和高级的查询引擎，所以在现在企业级应用中SparkSQL+Hive成为了业界使用大数据最为高效和流行的趋势。2，Hive是Facebook的推出，主要是为了让不动Java代码编程的人员也能
全面对比，深度解析 Ignite 与 Spark xaio7biancheng
经常有人拿Ignite和Spark进行比较，然后搞不清两者的区别和联系。Ignite和Spark，如果笼统归类，都可以归于内存计算平台，然而两者功能上虽然有交集，并且Ignite也会对Spark进行支持，但是不管是从定位上，还是从功能上来说，它们差别巨大，适用领域有显著的区别。本文从各个方面对此进行对比分析，供各位技术选型参考。一、综述Ignite和Spark都为Apache的顶级开源项目，遵循A
ignite redis_全面对比，深度解析 Ignite 与 Spark weixin_39997696 ignite redis
经常有人拿Ignite和Spark进行比较，然后搞不清两者的区别和联系。Ignite和Spark，如果笼统归类，都可以归于内存计算平台，然而两者功能上虽然有交集，并且Ignite也会对Spark进行支持，但是不管是从定位上，还是从功能上来说，它们差别巨大，适用领域有显著的区别。本文从各个方面对此进行对比分析，供各位技术选型参考。一、综述Ignite和Spark都为Apache的顶级开源项目，遵循A
数据写入因为汉字引发的异常 qq_40841339 spark hadoop hive hive hadoop 数据仓库
spark数据写hive表，发生查询分区异常问题异常：251071241926.49ERRORHive:MelaException(message.Exceptionthrownwhenexeculingquey.SELECTDISTINCT‘orgapache.hadop.hivemelastore.modelMpartionAs"NUCLEUSTYPE,AONCREATETIME,AO.LAS
语言合成模型Spark-TTS-0.5B学习笔记 tutgxuzyj spark 学习笔记
语言合成模型Spark-TTS-0.5B学习笔记语言合成是通过计算机技术将文字信息转换为自然流畅的语音输出，模拟人类语音。一、下载Spark-TTS-0.5B项目下载链接：https://github.com/SparkAudio/Spark-TTS.git注：需要科学网络。进入Spark-TTS文件夹，启动命令行窗口。创建Conda环境：condacreate-nsparktts-ypython
Spark-TTS 使用时间自由 AI 人工智能
1.开发背景上一章节使用了MegaTTS3实现文本转语音，但是后面才发现只能使用官方的语言包，没看到克隆功能，所以重新找了一个可以克隆语音的开源模型。2.开发需求在Ubuntu下实现Spark-TTS的部署，实现官方语音克隆，根据自定义文本输出语音。3.开发环境Ubuntu20.04+Conda+Spark-TTS+RTX5060TI4.实现步骤4.1安装环境#创建环境python版本建议3.10
Spark 的监控和性能调优高度依赖其内置的工具：【 Spark Web UI 和 Spark History Server】 csdn_tom_168 大数据 spark 大数据核心监控性能调优工具
Spark的监控和性能调优高度依赖其内置的SparkWebUI和SparkHistoryServer。它们是诊断作业性能瓶颈、资源利用率、错误原因和优化机会的最重要工具。一、SparkWebUI(DriverWebUI)当一个Spark应用程序(SparkContext)运行时，Driver进程会启动一个Web服务器，默认端口是4040(如果4040被占用，则尝试4041,4042等)。这是实时监
黑猴子的家：Spark RDD 编程进阶之广播变量黑猴子的家
广播变量用来高效分发较大的对象。向所有工作节点发送一个较大的只读值，以供一个或多个Spark操作使用。比如，如果你的应用需要向所有节点发送一个较大的只读查询表，甚至是机器学习算法中的一个很大的特征向量，广播变量用起来都很顺手。传统方式下，Spark会自动把闭包中所有引用到的变量发送到工作节点上。虽然这很方便，但也很低效。原因有二:首先，默认的任务发射机制是专门为小任务进行优化的；其次，事实上你可能
开源项目ESP-SparkBot: ESP32-S3 大模型 AI 桌面机器人（复刻分享） Qsm_lambda 机器人 ai AI编程
一、前言ESP-SparkBot是官方大佬，乐鑫小铁匠开源在立创开源硬件平台的项目，此贴是用于分享与记录复刻过程。开源地址：(ESP-SparkBot-立创开源硬件平台(oshwhub.com))千人讨论Q群362367052二、项目简介ESP-SparkBot是⼀款基于ESP32-S3，集成语⾳交互、图像识别、遥控操作和多媒体功能于⼀体的智能设备。它不仅可以通过语⾳助⼿实现
数据科学与大数据技术专业的核心课程体系及发展路径全解析 YangYang9YangYan 大数据
CDA数据分析师证书含金量高，适应了未来数字化经济和AI发展趋势，难度不高，行业认可度高，对于找工作很有帮助。一、课程体系三维地图二、核心课程能力矩阵课程模块关键技能行业应用场景工具链分布式计算Spark调优用户行为日志分析AWSEMR/Databricks数据挖掘特征工程金融反欺诈模型Scikit-learn实时数据处理Flink窗口计算物联网设备监控Kafka+Flink数据治理元数据管理企业
SpringBoot与ApacheSpark、MyBatis实战整合 KENYCHEN奉孝 spring实站大全 java 开发语言 mybatis spring
基于SpringBoot和ApacheSpark开发的实例以下是基于SpringBoot和ApacheSpark整合开发的实用示例分类及关键点，涵盖数据处理、机器学习、实时分析等场景。每个示例均提供核心思路和代码片段（Markdown格式）。数据处理与ETL示例1：CSV文件读取与处理SparkSessionspark=SparkSession.builder().appName("CSVProc
INVALID_COLUMN_NAME _AS_PATH
sparksql异常[INVALID_COLUMN_NAME_AS_PATH]ThedatasourceHiveFileFormatcannotsavethecolumnmin(birth_date)becauseitsnamecontainssomecharactersthatarenotallowedinfilepaths.Piease,useanallastorenameidemosqlSE
Hive/Spark小文件解决方案(企业级实战)–参数和SQL优化陆水A 大数据 hive hadoop spark python
重点是后面的参数优化一、小文件的定义在Hadoop的上下文中，小文件的定义是相对于Hadoop分布式文件系统（HDFS）的块（Block）大小而言的。HDFS是Hadoop生态系统中的核心组件之一，它设计用于存储和处理大规模数据集。在HDFS中，数据被分割成多个块，每个块的大小是固定的，这个大小在Hadoop的不同版本和配置中可能有所不同，但常见的默认块大小包括128MB、256MB等。基于这个背
Spark核心--RDD介绍陆水A 大数据 spark 大数据分布式
一、RDD的介绍rdd弹性分布式数据集是spark框架自己封装的数据类型，用来管理内存数据数据集：rdd数据的格式类似Python中[]。hive中的该结构[]叫数组rdd提供算子(方法)方便开发人员进行调用计算数据在pysaprk中本质是定义一个rdd类型用来管理和计算内存数据分布式：rdd可以时使用多台机器的内存资源完成计算弹性：可以通过分区将数据分成多份234，每份数据对应一个task线程处
C++与Hive、Spark、libhdfs、ACID交互技巧 KENYCHEN奉孝 C++开发语言 spring C++hive spark
C++与Hive交互的实例以下是C++与Hive交互的实例代码片段，涵盖连接、查询、数据操作等常见场景。假设使用libhdfs或thrift接口实现，部分示例需要结合Hive环境配置。基础连接与查询示例1：通过Thrift连接HiveServer2#include#include#includeusingnamespaceapache::thrift;usingnamespaceapache::h
全面的Spark学习资料合集：从基础到高级应用
本文还有配套的精品资源，点击获取简介：Spark是一个受到数据科学界青睐的大数据处理框架，以其高效、易用和可扩展性著称。本资料合集包括了Spark的基础学习材料、实战案例分析和高级应用实践，内容覆盖从Scala编程语言基础到Spark核心功能使用，再到大数据领域的实际应用。适合不同层次的学习者深入学习Spark，无论是初学者还是有经验的开发者，都能从中找到有价值的学习资源，帮助理解和掌握Spark
一文带你理清Spark Core调优的方方面面即将秃头的Java程序员
前言本文的注意事项观看本文前，可以先百度搜索一下Spark程序的十大开发原则看看哦文章虽然很长，可并不是什么枯燥乏味的内容，而且都是面试时的干货（我觉得）可以结合PC端的目录食用，可以直接跳转到你想要的那部分内容图非常的重要，是文章中最有价值的部分。如果不是很重要的图一般不会亲手画，特别是本文2.2.6的图非常重要此文会很大程度上借鉴美团的文章分享内容和Spark官方资料去进行说明，也会结合笔者自
AI系统Spark原理与代码实战案例讲解 AI天才研究院 AI大模型企业级应用开发实战 Agentic AI 实战 AI人工智能与大数据计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
AI系统Spark原理与代码实战案例讲解作者：禅与计算机程序设计艺术/ZenandtheArtofComputerProgramming关键词：Spark、大数据处理、分布式计算、机器学习、数据挖掘、实时流处理1.背景介绍1.1问题的由来在大数据时代,海量数据的高效处理和分析已成为各行各业的迫切需求。传统的数据处理方式难以应对数据量激增、数据类型多样化以及实时性要求高等挑战。为了解决这些问题,Ap
Spark大数据处理讲课笔记4.8 Spark SQL典型案例酒城译痴无心剑 #Spark基础学习笔记（1）spark 笔记 sql
文章目录零、本讲学习目标一、使用SparkSQL实现词频统计（一）提出任务（二）实现任务1、准备数据文件2、创建Maven项目3、修改源程序目录4、添加依赖和设置源程序目录5、创建日志属性文件6、创建HDFS配置文件7、创建词频统计单例对象8、启动程序，查看结果9、词频统计数据转化流程图二、使用SparkSQL计算总分与平均分（一）提出任务（二）完成任务1、准备数据文件2、新建Maven项目3、修