Python's built-in map versus the map method in concurrent.futures

The built-in map function in Python

Return an iterator that applies function to every item of iterable, yielding the results.

a = map(fun, iterable)

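a is an iterator, and it is lazy. In other words, running the line above does not call fun at all; fun runs only when next(a) is called, once per call. This is the defining trait of an iterator: values are produced on demand.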

The following example verifies this:

>>> def fun(value):
...     print('I have been run %d' % value)
...     return value
...
>>> a = map(fun, range(5))
>>> next(a)
I have been run 0
0
>>> next(a)
I have been run 1
1
>>>

Generators work on much the same principle.
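To make the comparison with generators concrete, here is a minimal sketch. lazy_map is a hypothetical helper written for this post, not the real implementation of the built-in map; it only shows that a generator also does no work until next() is called.

```python
calls = []  # records every value fun was actually called with


def fun(value):
    calls.append(value)
    return value


def lazy_map(function, iterable):
    """Hypothetical stand-in for map(), written as a generator."""
    for item in iterable:
        yield function(item)


gen = lazy_map(fun, range(5))
calls_before = list(calls)  # creating the generator has not called fun
first = next(gen)           # this runs fun(0), and nothing more
calls_after = list(calls)

print(calls_before, first, calls_after)  # [] 0 [0]
```

Recording the calls in a list rather than printing makes the order of events easy to check afterwards.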

The map method in concurrent.futures

Equivalent to map(func, *iterables) except func is executed asynchronously and several calls to func may be made concurrently.

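The official documentation says it is equivalent to the built-in map, except that func may be called concurrently. In practice there is one more difference, described below.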

a = executor.map(func, *iterables, timeout=None, chunksize=1)

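a is a generator and is lazy, but by the time the line above returns, every call to func has already been submitted for execution (note the difference from the built-in map). The laziness of the returned generator applies only to retrieving the results of func.
The following example verifies this: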

from concurrent.futures import ThreadPoolExecutor
import time


def task(name):
    print("name", name)
    time.sleep(1)
    return 1


if __name__ == "__main__":
    start = time.time()
    ex = ThreadPoolExecutor(5)
    # res is never consumed below, yet task still runs five times
    res = ex.map(task, range(5))
    # print(next(res))
    ex.shutdown(wait=True)
    # print(list(res))

    print("main")
    end = time.time()
    print(end - start)

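Output. All five "name" lines are printed even though res is never consumed. (With ex.shutdown(wait=True) as written, the main thread waits for the one-second tasks, so the elapsed time would be about 1 s; a millisecond-level figure like the one shown is what you get when the main thread does not wait, for example with shutdown(wait=False).)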

$ python3 test.py
name 0
name 1
name 2
name 3
name 4
main
0.002694368362426758
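The same behavior can be checked without relying on printed output or timing. The sketch below is an illustration added here, with func and calls as made-up names: it records each call in a list and inspects that list before the result generator is touched.

```python
from concurrent.futures import ThreadPoolExecutor

calls = []  # records every argument func was actually called with


def func(x):
    calls.append(x)  # list.append is safe to call from several threads in CPython
    return x * x


with ThreadPoolExecutor(max_workers=2) as executor:
    results = executor.map(func, range(5))
    # leaving the with-block calls executor.shutdown(wait=True)

# Every call has finished, although `results` has not been iterated yet.
print(sorted(calls))  # [0, 1, 2, 3, 4]

# Only now are the results fetched; they come back in input order.
squares = list(results)
print(squares)  # [0, 1, 4, 9, 16]
```

The calls list is sorted before printing because the worker threads may run the calls in any order; the results, by contrast, are always yielded in the order of the inputs.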

To find out why, let's look at the source code of map in concurrent.futures.
The module is located here:

>>> import concurrent.futures
>>> print(concurrent.futures.__file__)
/usr/lib/python3.5/concurrent/futures/__init__.py
>>>

The source of map itself is in /usr/lib/python3.5/concurrent/futures/_base.py:

def map(self, fn, *iterables, timeout=None, chunksize=1):
    if timeout is not None:
        end_time = timeout + time.time()

    fs = [self.submit(fn, *args) for args in zip(*iterables)]

    # Yield must be hidden in closure so that the futures are submitted
    # before the first iterator value is required.
    def result_iterator():
        try:
            for future in fs:
                if timeout is None:
                    yield future.result()
                else:
                    yield future.result(end_time - time.time())
        finally:
            for future in fs:
                future.cancel()
    return result_iterator()

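As the source shows, the list comprehension on line 5 of the listing (fs = [...]) submits every call to fn up front, so the calls start running right away; what map returns at the end is a generator over the results of those calls. This agrees with the behavior observed above.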
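The comment in the source about hiding the yield in a closure is the key point, and it can be illustrated with a small sketch. Both helpers below, eager_map and lazy_submit_map, are hypothetical functions written for this post; they are simplified and leave out the timeout and cancellation handling of the real method.

```python
from concurrent.futures import ThreadPoolExecutor


def eager_map(executor, fn, iterable):
    # Same shape as Executor.map: submit everything first,
    # keep the yield inside an inner function.
    futures = [executor.submit(fn, item) for item in iterable]

    def result_iterator():
        for future in futures:
            yield future.result()
    return result_iterator()


def lazy_submit_map(executor, fn, iterable):
    # The yield in this body makes the whole function a generator function,
    # so even the submit() line does not run until the first next().
    futures = [executor.submit(fn, item) for item in iterable]
    for future in futures:
        yield future.result()


calls = []  # records every argument that was actually processed


def record(x):
    calls.append(x)
    return x


executor = ThreadPoolExecutor(max_workers=2)

lazy = lazy_submit_map(executor, record, range(3))
calls_after_lazy = list(calls)     # nothing has been submitted yet

eager = eager_map(executor, record, range(3))
executor.shutdown(wait=True)       # wait for the submitted work to finish
calls_after_eager = sorted(calls)  # the eager version has run all three calls

eager_results = list(eager)
print(calls_after_lazy, calls_after_eager, eager_results)
# [] [0, 1, 2] [0, 1, 2]
```

The lazy variant is never iterated here, so it never submits anything; the eager variant has done all of its work before a single result is requested.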

That's all.