Machine Learning Methods: Decision trees and forests
This post contains our crib notes on the basics of decision trees and forests. We first discuss the construction of individual trees, and then introduce random and boosted forests. We also discuss efficient implementations of greedy tree construction algorithms, showing that a single tree can be constructed in $O(k \times n \log n)$ time, where $n$ is the number of training examples and $k$ is the number of features.
Decision trees constitute a class of simple functions that are frequently used for carrying out regression and classification. They are constructed by hierarchically splitting a feature space into disjoint regions, where each split divides one of the existing regions into two. In most common implementations, the splits are always taken along one of the feature axes, which causes the regions to be rectangular in shape. An example is shown in Fig. 1 below. In this example, a two-dimensional feature space is first split by the tree along one of its feature axes, and the resulting regions are then split further along the feature axes, producing the rectangular partition shown in the figure.
Once a decision tree is constructed, it can be used for making predictions on unlabeled feature vectors — i.e., points in feature space not included in our training set. This is done by first deciding which of the regions a new feature vector belongs to, and then returning as its hypothesis label an average over the training example labels within that region: The mean of the region’s training labels is returned for regression problems and the mode for classification problems. For instance, the tree in Fig. 1 would return an average over the five training examples sitting in the region to which the test point belongs.
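As a concrete illustration of this lookup-and-average rule, here is a minimal sketch for a regression tree consisting of a single split (the threshold and training points below are invented for the example and are not those of Fig. 1):

```python
import numpy as np

# a toy, one-split partition of a 2D feature space: two regions separated at x0 = 0.5
threshold = 0.5
train_X = np.array([[0.1, 0.9], [0.3, 0.2], [0.7, 0.4], [0.9, 0.8]])
train_y = np.array([1.0, 2.0, 5.0, 7.0])

def predict(x):
    """Return the mean training label of the region containing x (regression)."""
    if x[0] <= threshold:
        in_region = train_X[:, 0] <= threshold
    else:
        in_region = train_X[:, 0] > threshold
    return train_y[in_region].mean()

print(predict([0.2, 0.5]))  # left region  -> (1.0 + 2.0) / 2 = 1.5
print(predict([0.8, 0.1]))  # right region -> (5.0 + 7.0) / 2 = 6.0
```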
The art and science of tree construction is in deciding how many splits should be taken and where those splits should take place. The goal is to find a tree that provides a reasonable, piece-wise constant approximation to the underlying distribution or function that has generated the training data provided. This can be attempted through choosing a tree that breaks space up into regions such that the examples in any given region have identical — or at least similar — labels. We discuss some common approaches to finding such trees in the next section.
Individual trees have the important benefit of being easy to interpret and visualize, but they are often not as accurate as other common machine learning algorithms. However, individual trees can be used as simple building blocks with which to construct more complex, competitive models. In the third section of this note, we discuss three very popular constructions of this sort: bagging, random forests (a variant on bagging), and boosting. We then discuss the runtime complexity of tree/forest construction and conclude with a summary, exercises, and an appendix containing example python code.
Regression tree construction typically proceeds by attempting to minimize a squared error cost function: Given a training set $\{(\vec{x}_i, y_i)\}_{i=1,\ldots,n}$, one seeks the partition of feature space into regions $R_1, \ldots, R_M$ that minimizes

$$J = \sum_{m=1}^{M} \sum_{i:\, \vec{x}_i \in R_m} \left(y_i - \bar{y}_{R_m}\right)^2,$$

where $\bar{y}_{R_m}$ is the mean label of the training examples that sit in region $R_m$.
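In code, this cost can be evaluated as follows; the sketch assumes the tree's partition is summarized by an array assigning each training example to a region (the function name is ours):

```python
import numpy as np

def regression_tree_cost(y, region_ids):
    """Squared-error cost J: sum over regions of the squared deviations
    of each training label from its region's mean label."""
    cost = 0.0
    for r in np.unique(region_ids):
        region_labels = y[region_ids == r]
        cost += np.sum((region_labels - region_labels.mean()) ** 2)
    return cost

# example: four training labels split across two regions
print(regression_tree_cost(np.array([1.0, 2.0, 5.0, 7.0]), np.array([0, 0, 1, 1])))
```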
Unfortunately, actually minimizing this cost over all possible trees is computationally infeasible: The number of candidate partitions grows far too quickly with the size of the training set. Instead, trees are typically grown greedily. Starting from a single region containing all of the training data, one repeatedly chooses the single axis-aligned split that most reduces the cost, applies it, and then recurses on the resulting sub-regions.
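The following is a brute-force sketch of this greedy, recursive procedure for regression (names are ours; every distinct feature value is tried as a cut point, which is simple but inefficient; a faster split search is described in the runtime section below):

```python
import numpy as np

def build_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Greedily grow a regression tree; leaves store their region's mean label."""
    if depth == max_depth or len(y) < 2 * min_leaf:
        return {'leaf': True, 'value': y.mean()}
    best = None
    for j in range(X.shape[1]):                # candidate feature directions
        for t in np.unique(X[:, j])[:-1]:      # candidate cut points
            left = X[:, j] <= t
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            # squared-error cost of the two sub-regions this cut would create
            cost = np.sum((y[left] - y[left].mean()) ** 2) + \
                   np.sum((y[~left] - y[~left].mean()) ** 2)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:                           # no admissible split found
        return {'leaf': True, 'value': y.mean()}
    _, j, t = best
    left = X[:, j] <= t
    return {'leaf': False, 'feature': j, 'threshold': t,
            'left': build_tree(X[left], y[left], depth + 1, max_depth, min_leaf),
            'right': build_tree(X[~left], y[~left], depth + 1, max_depth, min_leaf)}

def predict_one(tree, x):
    """Descend to the region containing x and return that region's mean label."""
    while not tree['leaf']:
        tree = tree['left'] if x[tree['feature']] <= tree['threshold'] else tree['right']
    return tree['value']
```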
In classification problems, the training labels take on a discrete set of values, often having no numerical significance. This means that a squared-error cost function, like that written above for regression, is not appropriate. Instead, classification trees are usually grown by greedily minimizing a measure of region impurity. Writing $\hat{p}_{R,c}$ for the fraction of training examples in region $R$ that belong to class $c$, three common choices are the misclassification error rate, $1 - \max_c \hat{p}_{R,c}$; the Gini index, $\sum_c \hat{p}_{R,c} (1 - \hat{p}_{R,c})$; and the cross-entropy, $-\sum_c \hat{p}_{R,c} \log \hat{p}_{R,c}$.
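The three measures can be computed from a region's training labels as in the following sketch (function name ours):

```python
import numpy as np

def region_impurities(labels):
    """Misclassification rate, Gini index, and cross-entropy of one region."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # class proportions within the region
    error_rate = 1.0 - p.max()
    gini = np.sum(p * (1.0 - p))
    cross_entropy = -np.sum(p * np.log(p))
    return error_rate, gini, cross_entropy

# a region holding four examples of class 0 and two of class 1
print(region_impurities(np.array([0, 0, 0, 0, 1, 1])))
```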
Although the error rate is perhaps the most natural of these measures, the Gini index and cross-entropy are usually preferred for guiding tree growth in practice: Both are smooth functions of the class proportions and reward splits that produce nearly pure sub-regions, even when such splits do not immediately change a region's majority class (and so leave the error rate untouched).
Decision trees that are allowed to split indefinitely will have low bias but will over-fit their training data. Placing different stopping criteria on a tree’s growth can ameliorate this latter effect. Two typical conditions often used for this purpose are given by a) placing an upper bound on the number of levels permitted in the tree, or b) requiring that each region (tree leaf) retains at least some minimum number of training examples. To optimize over such constraints, one can apply cross-validation.
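For instance, using the current sk-learn API these two constraints correspond to the max_depth and min_samples_leaf parameters, and cross-validation over them can be carried out along the following lines (the grid values here are arbitrary):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

digits = load_digits()

# 5-fold cross-validation over tree depth and minimum leaf size
param_grid = {'max_depth': [2, 4, 8, None], 'min_samples_leaf': [1, 5, 20]}
search = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
search.fit(digits.data, digits.target)
print(search.best_params_)
```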
Another approach to alleviating the high-variance, over-fitting issue associated with decision trees is to average over many of them. This approach is motivated by the observation that the sum of $N$ independent, identically distributed random variables, each of variance $\sigma^2$, has variance $N \sigma^2$, so that their average has variance only $\sigma^2 / N$. Averaging the predictions of many approximately independent trees can therefore be expected to reduce the variance of the resulting model without increasing its bias.
Bootstrap aggregation, or “bagging”, provides one common method for constructing ensemble tree models. In this approach, one samples with replacement to obtain a set of bootstrapped copies of the training data, each of the same size as the original. A separate decision tree is then fit to each copy, and the ensemble’s prediction for a new point is taken to be the average of the individual tree predictions (for regression) or their majority vote (for classification).
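A bare-bones sketch of this procedure for regression follows, assuming generic training arrays X and y (helper names ours):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_bagged_trees(X, y, n_trees=100, seed=0):
    """Fit each tree to its own bootstrap sample, drawn with replacement."""
    rng = np.random.RandomState(seed)
    trees = []
    for _ in range(n_trees):
        idx = rng.randint(0, len(X), size=len(X))   # bootstrap sample of indices
        trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return trees

def predict_bagged(trees, X):
    """Average the individual tree predictions (use a majority vote for classification)."""
    return np.mean([tree.predict(X) for tree in trees], axis=0)
```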
One nice thing about bagging methods, in general, is that one can train on the entire set of available labeled training data and still obtain an estimate of the generalization error. Such estimates are obtained by considering the error on each point in the training set, in each case averaging only over those trees that did not train on the point in question. The resulting estimate, called the out-of-bag error, typically provides a slight overestimate to the generalization error. This is because accuracy generally improves with growing ensemble size, and the full ensemble is usually about three times larger than the sub-ensemble used to vote on any particular training example in the out-of-bag error analysis.
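sk-learn exposes this estimate directly; for example (parameter values are arbitrary):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

digits = load_digits()
forest = RandomForestClassifier(n_estimators=200, oob_score=True)
forest.fit(digits.data, digits.target)
print(forest.oob_score_)   # out-of-bag accuracy, i.e., one minus the out-of-bag error
```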
Random forests provide a popular variation on the bagging method. The individual decision trees making up a random forest are, again, each fit to an independent, bootstrapped subsample of the training data. However, at each step in their recursive construction process, these trees are restricted in that they are only allowed to split on a small, randomly chosen subset of the available features (a common choice is roughly the square root of the total number of features). This restriction serves to decorrelate the individual trees, which further reduces the variance of the averaged model.
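In sk-learn, this per-split restriction is controlled by the max_features parameter; choosing the square root of the total number of features is a common rule of thumb:

```python
from sklearn.ensemble import RandomForestClassifier

# each split may only consider a random subset of sqrt(k) of the k features
forest = RandomForestClassifier(n_estimators=200, max_features='sqrt')
```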
The final method we’ll discuss is boosting, which again consists of a set of individual trees that collectively determine the ultimate prediction returned by the model. However, in the boosting scenario, one fits each of the trees to the full data set, rather than to a small sample. Because they are fit to the full data set, these trees are usually restricted to being only two or three levels deep, so as to avoid over-fitting. Further, the individual trees in a boosted forest are constructed sequentially. For instance, in regression, the process typically works as follows: In the first step, a tree is fit to the full, original training set. In the second, the residuals of this first fit are computed, i.e., the differences between the training labels and the first tree’s predictions, and a second tree is fit to these residuals. The process then repeats, with each new tree fit to the residuals left by the current ensemble, and the final model’s prediction is obtained by summing the (often down-weighted) outputs of the individual trees.
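A sketch of this residual-fitting loop for squared-error boosting follows (the learning_rate shrinkage factor is a standard extra ingredient, not part of the description above):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Sequentially fit shallow trees, each to the residuals of the current ensemble."""
    trees = []
    residual = y.astype(float).copy()
    for _ in range(n_trees):
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        trees.append(tree)
        residual -= learning_rate * tree.predict(X)   # what is left for the next tree to fit
    return trees

def predict_boosted(trees, X, learning_rate=0.1):
    """The ensemble prediction is the scaled sum of the individual tree outputs."""
    return learning_rate * np.sum([tree.predict(X) for tree in trees], axis=0)
```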
Boosted classification tree ensembles are constructed in a fashion similar to that above. However, in contrast to the regression scenario, the same, original training labels are used to fit each new tree in the ensemble (as opposed to an evolving residual). To bring about a similar, gradual learning process, boosted classification ensembles instead sample from the training set with weights that are sample-dependent and that change over time: When constructing a new tree for the ensemble, one more heavily weights those examples that have been poorly fit in prior iterations. AdaBoost is a popular algorithm for carrying out boosted classification. This and other generalizations are covered in the text Elements of Statistical Learning.
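A minimal example using sk-learn's AdaBoostClassifier (by default the weak learners are depth-one trees; the split of the digits data below is arbitrary):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier

digits = load_digits()

# boosted ensemble of shallow trees; poorly fit examples get larger weights over time
clf = AdaBoostClassifier(n_estimators=200)
clf.fit(digits.data[:1000], digits.target[:1000])
print(clf.score(digits.data[1000:], digits.target[1000:]))
```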
Before concluding, we take a moment here to consider the runtime complexity of tree construction. This exercise gives one a sense of how tree algorithms are implemented efficiently in practice. We begin by considering the greedy construction of a single classification tree. The extension to regression trees is straightforward.
Consider the problem of greedily training a single classification tree on a set of $n$ labeled training examples, each characterized by $k$ features.
Focus on an intermediate moment in the construction process where one particular node has just been split, resulting in two new regions. To continue the construction, we must find the best available split within each of these regions, which requires evaluating the cost function for every candidate cut along each of the $k$ feature directions. This search is made efficient by keeping the examples of each region sorted along every feature direction.
The left side of Fig. 4 illustrates one method for efficiently carrying out these test cuts: For each feature direction, we proceed sequentially through that direction’s ordered list, considering one cut at a time. In the first cut, we take only one example in the left sub-region induced, and all others on the right. In the second cut, we have the first two examples in the left sub-region, etc. Proceeding in this way, it turns out that the cost function of each new candidate split considered can always be evaluated in constant time, independent of the region’s size: Moving the cut forward by one position simply transfers a single example from the right sub-region to the left, so the class counts that determine the cost of the two sub-regions can be updated incrementally rather than recomputed from scratch. Scanning all candidate cuts along one feature direction for a region holding $m$ examples therefore takes only $O(m)$ time, or $O(k \times m)$ over all feature directions.
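The sketch below carries out this incremental scan along a single, pre-sorted feature direction, using the Gini index as the cost (names ours):

```python
import numpy as np

def best_split_along_feature(values, labels, n_classes):
    """Scan every cut along one feature direction, already sorted by `values`.

    Each candidate cut is scored by moving a single example's class count
    from the right sub-region to the left, rather than recounting from scratch.
    `labels` must be an integer array of class indices.
    """
    n = len(labels)
    left_counts = np.zeros(n_classes)
    right_counts = np.bincount(labels, minlength=n_classes).astype(float)

    def gini(counts):
        p = counts / counts.sum()
        return np.sum(p * (1.0 - p))

    best_cost, best_cut = np.inf, None
    for i in range(n - 1):                 # cut between examples i and i + 1
        left_counts[labels[i]] += 1        # constant-time update of the two sub-regions
        right_counts[labels[i]] -= 1
        n_left, n_right = i + 1, n - i - 1
        cost = (n_left * gini(left_counts) + n_right * gini(right_counts)) / n
        if cost < best_cost:
            best_cost, best_cut = cost, 0.5 * (values[i] + values[i + 1])
    return best_cut, best_cost
```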
The above analysis gives the time needed to search for the optimal split within a single region: $O(k \times m)$ for a region holding $m$ examples, provided its sorted feature lists are available. To estimate the cost of constructing an entire tree, this must be summed over all of the regions generated. For an approximately balanced tree, the regions at any given level together contain all $n$ training examples, so each level costs $O(k \times n)$; a tree of depth $O(\log n)$ therefore requires $O(k \times n \log n)$ work, the same order as the initial sorting of the feature lists (which can be maintained through each split at linear cost).
In summary, we see that achieving an efficient, $O(k \times n \log n)$ greedy construction rests on two ingredients: keeping the training examples sorted along each feature direction, and updating the split cost incrementally so that each candidate cut is scored in constant time.
If a forest of $N$ such trees is desired, the total construction cost is simply $N$ times that of a single tree.
In this note, we’ve quickly reviewed the basics of tree-based models and their constructions. Looking back over what we have learned, we can now consider some of the reasons why tree-based methods are so popular among practitioners. First — and very importantly — individual trees are often useful for gaining insight into the geometry of datasets in high dimensions. This is because tree structures can be visualized using simple diagrams, like that in Fig. 1. In contrast, most other machine learning algorithm outputs cannot be easily visualized — consider, e.g., support-vector machines, which return hyper-plane decision boundaries. A related point is that tree-based approaches are able to automatically fit non-linear decision boundaries. In contrast, linear algorithms can only fit such boundaries if appropriate non-linear feature combinations are constructed. This requires that one first identify these appropriate feature combinations, which can be a challenging task for feature spaces that cannot be directly visualized. Three additional positive qualities of decision trees are given by a) the fact that they are insensitive to feature scale, which reduces the need for related data preprocessing, b) the fact that they can make use of data missing certain feature values, and c) that they are relatively robust against outliers and noisy-labeling issues.
Although boosted and random forests are not as easily visualized as individual decision trees, these ensemble methods are popular because they are often quite competitive. Boosted forests typically have a slightly lower generalization error than their random forest counterparts. For this reason, they are often used when accuracy is highly-valued — see Fig. 5 for an example learning curve consistent with this rule of thumb: Generalization error rate versus training set size for a hand-written digits learning problem. However, the individual trees in a bagged forest can be constructed in parallel. This benefit — not shared by boosted forests — can favor random forests as a go-to, out-of-box approach for treating large-scale machine learning problems.
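In sk-learn, this parallelism is exposed through the n_jobs parameter of the forest classes, e.g.:

```python
from sklearn.ensemble import RandomForestClassifier

# grow the individual trees of the forest in parallel, using all available cores
forest = RandomForestClassifier(n_estimators=500, n_jobs=-1)
```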
Exercises follow that detail some further points of interest relating to decision trees and their construction.
[1] Elements of Statistical Learning, by Hastie, Tibshirani, Friedman
[2] An Introduction to Statistical Learning, by James, Witten, Hastie, and Tibshirani
[3] Random Forests, by Breiman (Machine Learning, 45, 2001).
[4] Sk-learn documentation on runtime complexity, see section 1.8.4.
a) Consider a real function
b) Consider binary tree classification guided by the minimization of the error rate (
c) How about if (
Suppose one has constructed an approximately balanced decision tree, where each node contains one of the
a) Consider a region
b) Show that a region’s Gini coefficient (
Consider a region
Adapted from [3].
a) Let
b) Consider a binary classification problem aimed at fitting a sampled function
c) Show that
d) Writing,
Here, we provide the python/sk-learn code used to construct Fig. 5 in the body of this note: Learning curves on sk-learn’s “digits” dataset for a single tree, a random forest, and a boosted forest.
```python
from sklearn.datasets import load_digits
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
import numpy as np
import matplotlib.pyplot as plt

# load data: digits.data and digits.target,
# arrays of features and labels, resp.
digits = load_digits(n_class=10)

n_train = []
t1_accuracy = []
t2_accuracy = []
t3_accuracy = []

# below, we average over "trials" fits at each sample size
# in order to estimate the average generalization error.
trials = 25
clf = DecisionTreeClassifier()
clf2 = GradientBoostingClassifier(max_depth=3)
clf3 = RandomForestClassifier()
num_test = 500

# loop over different training set sizes
for num_train in range(2, len(digits.target) - num_test, 25):
    acc1, acc2, acc3 = 0, 0, 0
    for j in range(trials):
        # resample until the training slice contains at least two classes
        perm = [0]
        while len(set(digits.target[perm[:num_train]])) < 2:
            perm = np.random.permutation(len(digits.data))
        clf = clf.fit(digits.data[perm[:num_train]],
                      digits.target[perm[:num_train]])
        acc1 += clf.score(digits.data[perm[-num_test:]],
                          digits.target[perm[-num_test:]])
        clf2 = clf2.fit(digits.data[perm[:num_train]],
                        digits.target[perm[:num_train]])
        acc2 += clf2.score(digits.data[perm[-num_test:]],
                           digits.target[perm[-num_test:]])
        clf3 = clf3.fit(digits.data[perm[:num_train]],
                        digits.target[perm[:num_train]])
        acc3 += clf3.score(digits.data[perm[-num_test:]],
                           digits.target[perm[-num_test:]])
    n_train.append(num_train)
    t1_accuracy.append(acc1 / trials)
    t2_accuracy.append(acc2 / trials)
    t3_accuracy.append(acc3 / trials)

# plot the learning curves: single tree (red), boosted (green), random forest (blue)
plt.plot(n_train, t1_accuracy, color='red')
plt.plot(n_train, t2_accuracy, color='green')
plt.plot(n_train, t3_accuracy, color='blue')
plt.show()
```