[Modern C++] Expressing "optional" in C++: std::optional<>

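Background

Many programming situations call for the notion of "optional": optional parameters, optional return values, and so on. Traditional C/C++ offers little support for it. The examples below illustrate the problem.

Binary search

In binary search, the value we are looking for may not be in the collection. How should we represent that case? An earlier article gave binary search implementations in Python and Haskell: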

# Python
def binary_search(items, item):
    low = 0
    high = len(items) - 1

    while low <= high:
        mid = (low + high) // 2  # midpoint of the current range
        guess = items[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid - 1
        else:
            low = mid + 1
    return None

--Haskell
import qualified Data.Vector as V

binarySearch :: (Ord a)=>  V.Vector a -> Int -> Int -> a -> Maybe Int
binarySearch vec low high e
          | low > high = Nothing
          | vec V.! mid > e = binarySearch vec low (mid-1) e
          | vec V.! mid < e = binarySearch vec (mid+1) high e
          | otherwise = Just mid
          where
              mid = low + ((high-low) `div` 2)

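As you can see, Python uses None and Haskell uses Nothing to say the element was not found; neither relies on a special number to signal the failure. The two are much alike: both say that the function's return value is "optional", that is, the lookup may fail. The most direct benefit is that expressing this case in the type gives the caller more explicit information about the result and makes the function more readable.

Traditional C/C++ has no comparable support, so we can only write: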

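int binary_search(const std::vector<int> &list, int item) {
    // Signed indices: high becomes -1 for an empty vector or when mid is 0,
    // which an unsigned size_t could not represent.
    int low{0};
    int high{static_cast<int>(list.size()) - 1};

    while (low <= high) {
        auto mid = low + (high - low) / 2;
        auto guess = list[mid];
        if (guess == item) {
            return mid;
        } else if (guess > item) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }

    return -1;
}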

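Here the special value -1 indicates that item was not found.

String search functions

Function parameters can also need to be optional. If the caller does not supply the parameter, a default value is filled in. Many functions in the standard library use this strategy.

For example, these member functions of the standard library's std::string class: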
size_type find( const basic_string& str, size_type pos = 0 ) const noexcept;
size_type find_last_of( const basic_string& str, size_type pos = npos ) const noexcept;

One searches forward and the other backward; pos defaults to a specific value, here 0 and std::string::npos respectively.
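To make the two defaults concrete, here is a minimal sketch that exercises them. The helper names contains_from and last_of are hypothetical and exist only for this illustration; the calls to find and find_last_of are the real std::string members.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when `needle` occurs in `text` at or after `pos`.
// Omitting `pos` means "search from the start" (the default 0).
bool contains_from(const std::string &text, const std::string &needle,
                   std::string::size_type pos = 0) {
    // find() reports "not found" with the special value std::string::npos.
    return text.find(needle, pos) != std::string::npos;
}

// Hypothetical helper: index of the last character of `text` that appears in
// `chars`. find_last_of is called without pos, so its npos default applies
// and the search starts from the end of the string.
std::string::size_type last_of(const std::string &text, const std::string &chars) {
    return text.find_last_of(chars);
}
```

Note that both the default argument and the "not found" result are ordinary size_type values; nothing in the types separates them from a real position.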

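In the function type, however, the parameter is still just size_type, which tells the caller little. Other languages do better here: in Haskell, for example, we can use Maybe T as the parameter type, which makes it obvious at a glance that the parameter must handle the optional case.

Next we discuss the drawbacks of the traditional approach.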

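Drawbacks of the traditional approach

As the two examples show, the traditional approach represents "optional" with special values. What are its drawbacks?

Optional semantics are invisible in the data type

Input parameters use the default-argument mechanism, so at least some information is visible. For an optional return value, though, the function signature reveals nothing at all; we can only learn it from the API documentation.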

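Conventions for "optional" are inconsistent and varied

By convention, a "not found" result is usually reported with a meaningless value such as -1 or nullptr. But a convention has no force over the compiler and is upheld only by people, so some functions will not follow it. The result: different libraries follow different conventions, and even functions in the same library written by different people may disagree, which raises the cost of using them. The standard library is fairly consistent, but that only masks the problem temporarily without removing its cause.

Input parameters are even less uniform. Some people prefer a valid value as the default argument, as find does; others prefer an invalid value, as find_last_of does. A valid value is easier to understand but is not always available: in find_last_of, for instance, the size of the string is not known statically. An invalid value avoids that problem but raises another: a partial function has invalid values to choose from, but a total function accepts every argument value, so where would an invalid one come from?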
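The total-function problem can be sketched as follows. Everything here is hypothetical: shift and runtime_default_offset are made-up names, and INT_MIN is an arbitrary choice of marker. Because every int is a legitimate offset, whichever value is borrowed as the "not provided" marker can no longer be passed for real.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for a default that is only known at run time.
int runtime_default_offset() { return 10; }

// Every int is a valid offset, so INT_MIN is borrowed to mean "not provided".
int shift(int x, int offset = INT_MIN) {
    if (offset == INT_MIN) {
        // The marker is indistinguishable from a caller who really passed INT_MIN.
        offset = runtime_default_offset();
    }
    return x + offset;
}
```

A caller who writes shift(1, INT_MIN) on purpose silently gets the run-time default instead, and the signature gives no warning of it.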

Because of these drawbacks, C++17 finally introduced std::optional<>.

std::optional<>

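It is fairly easy to use and requires the header: #include <optional>.

Its use is explained in three parts:

Optional function parameters

Suppose we wanted to rework the standard library's find function. We would only need to change the signature to:
size_type find( const basic_string& str, std::optional<size_type> pos = std::nullopt ) const noexcept;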

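pos now has the optional type std::optional<size_type>, and std::nullopt is its default value. std::nullopt is a special constant defined by the standard library; here it indicates that no value was supplied for pos.

Even though the parameter type changed, the way the function is called is not affected. We can still call it like this: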

std::string line {"abcd123445555"};

line.find("add"); // uses the default
line.find("add", 1); // starts from the second character
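Since std::string::find itself cannot be changed, the idea can be tried out with a free function. This is only a sketch: find_from is a hypothetical wrapper, and treating an omitted position as 0 is an assumption that mirrors the default of the real find.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical wrapper around std::string::find with an optional start position.
std::string::size_type find_from(const std::string &text, const std::string &needle,
                                 std::optional<std::string::size_type> pos = std::nullopt) {
    // value_or(0): an omitted position falls back to searching from the start.
    return text.find(needle, pos.value_or(0));
}
```

Callers can write find_from(line, "add") or find_from(line, "add", 1); the integer argument converts implicitly to the optional parameter, so call sites look the same as before.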

Optional function return values

Following the same pattern as the parameter change, we modify binary_search to:

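std::optional<int> binary_search(const std::vector<int> &list, int item) {
    // Signed indices: high becomes -1 for an empty vector or when mid is 0,
    // which an unsigned size_t could not represent.
    int low{0};
    int high{static_cast<int>(list.size()) - 1};

    while (low <= high) {
        auto mid = low + (high - low) / 2;
        auto guess = list[mid];
        if (guess == item) {
            return mid;
        } else if (guess > item) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }

    return std::nullopt;
}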

As with parameter passing, std::optional<T> provides an implicit conversion from T to std::optional<T>, so we can return a value of type T directly.
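The same return-type pattern also works on top of a standard algorithm. The sketch below is an alternative formulation, not the article's function: find_sorted is a hypothetical name, it assumes the vector is sorted in ascending order, and it returns std::optional<std::size_t>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical variant built on std::lower_bound; `sorted` must be ascending.
std::optional<std::size_t> find_sorted(const std::vector<int> &sorted, int item) {
    // lower_bound returns the first position whose element is not less than item.
    auto it = std::lower_bound(sorted.begin(), sorted.end(), item);
    if (it != sorted.end() && *it == item) {
        // A plain std::size_t converts implicitly to std::optional<std::size_t>.
        return static_cast<std::size_t>(it - sorted.begin());
    }
    return std::nullopt;  // also covers the empty vector
}
```

Because "not found" is no longer encoded as an index, an unsigned index type can be used without reserving one of its values.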

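Handling parameters or return values of type std::optional

Optional parameters and optional return values are handled the same way; we use an optional return value as the example.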

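...
auto found = binary_search(list, 2);

// std::optional converts to bool in a condition, so it can be tested directly with if
if (found) {
    std::cout << "found " << *found; // *found yields the value
}

// We can also use the has_value() member function
if (found.has_value()) {
    std::cout << "found " << found.value(); // the value() member function yields the value
}

// The comparison operators are overloaded, so we can also compare against std::nullopt
if (found != std::nullopt) {
    std::cout << "found " << *found;
}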
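The checks above combine well with the C++17 if statement with initializer, which keeps the optional in scope only where it is tested. The following is a sketch with made-up names: digit_value and describe are hypothetical and serve only to show the pattern.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup: maps a digit name to its value, if the name is known.
std::optional<int> digit_value(const std::string &name) {
    if (name == "one") return 1;
    if (name == "two") return 2;
    return std::nullopt;
}

std::string describe(const std::string &name) {
    // `found` is visible only inside this if/else, so it cannot be
    // dereferenced by accident later in the function.
    if (auto found = digit_value(name); found) {
        return name + " = " + std::to_string(*found);
    } else {
        return name + " is unknown";
    }
}
```

With these definitions, describe("two") yields "two = 2" and describe("ten") yields "ten is unknown".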

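If we call value() without checking whether found holds a value, it throws std::bad_optional_access when found is empty, and the exception has to be caught. (Dereferencing an empty optional with * does not throw; it is undefined behavior.)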

try {
    int n = found.value();
} catch(const std::bad_optional_access& e) {
    std::cout << e.what() << '\n';
}

Catching an exception interrupts the normal flow of execution. If a missing value should simply be treated as 0, we can write:

int n = found.value_or(0);

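This lets execution continue more smoothly.

Code example

That covers the main uses of the tool; we close the article with an example. It simulates a user login: the user's login name is used to look up the user ID, which completes the login. Two functions model the process, get_user_from_login_name and write_login_log; they are simple enough to need no explanation. The scenario is simplified: a login succeeds as long as the login name exists in the system.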

#include <ctime>
#include <iostream>
#include <map>
#include <optional>
#include <string>

void write_login_log(int user_id, std::optional<std::time_t> cur_time = std::nullopt) {

    std::time_t cur = 0;
    if (cur_time) {
        cur = *cur_time;
    } else {
        // No time supplied: fall back to the current time.
        cur = std::time(nullptr);
    }
    std::cout << "User: " << user_id << ", time: " << cur << std::endl;
}

std::optional<int> get_user_from_login_name(const std::string &login_name) {
    std::map<std::string, int> map_login{{"login1", 1},
                                         {"login2", 2}};

    auto found = map_login.find(login_name);
    if (found != map_login.cend()) {
        return found->second;
    }

    return std::nullopt;
}

int main() {

    auto user = get_user_from_login_name("login1");

    if (user) {
        write_login_log(*user);
    }

    return 0;
}
