[root@node1 bin]# ./pyspark
Python 3.8.8 (default, Apr 13 2021, 19:58:26)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
ll
22/02/01 11:50:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/02/01 11:50:46 WARN HiveConf: HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
22/02/01 11:50:48 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.1.2
      /_/

Using Python version 3.8.8 (default, Apr 13 2021 19:58:26)
Spark context Web UI available at http://node1:4041
Spark context available as 'sc' (master = local[*], app id = local-1643687448593).
SparkSession available as 'spark'.
>>> ll
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'll' is not defined
>>> rdd = sc.parallelize([('a',1),('a',11),('b',3),('b',5)])
>>> rdd.map(lamber x:(x[0],x[1]*10)).collect()
  File "<stdin>", line 1
    rdd.map(lamber x:(x[0],x[1]*10)).collect()
                   ^
SyntaxError: invalid syntax
>>> rdd.map(lambda x:(x[0],x[1]*10)).collect()
[('a', 10), ('a', 110), ('b', 30), ('b', 50)]
>>> rdd.mapValues(lambda value: value * 10).collect()
[('a', 10), ('a', 110), ('b', 30), ('b', 50)]
>>>
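
The last two calls in the session return the same list, which suggests that on a pair RDD `mapValues(f)` behaves like a `map` that rebuilds each pair as `(key, f(value))`. The following is a minimal plain-Python sketch of that equivalence, runnable without Spark. The `pairs` list and the `map_values` helper are hypothetical stand-ins introduced only for illustration; they are not part of the PySpark API.

```python
# Plain-Python stand-in for the pair RDD from the session (no Spark needed).
pairs = [('a', 1), ('a', 11), ('b', 3), ('b', 5)]


def map_values(kv_pairs, func):
    """Hypothetical helper: apply func to each value, leaving each key untouched."""
    return [(key, func(value)) for key, value in kv_pairs]


# Counterpart of rdd.map(lambda x: (x[0], x[1] * 10)):
# the function receives the whole pair and must rebuild it.
via_map = [(x[0], x[1] * 10) for x in pairs]

# Counterpart of rdd.mapValues(lambda value: value * 10):
# the function receives only the value.
via_map_values = map_values(pairs, lambda value: value * 10)

print(via_map)
print(via_map_values)
```

In Spark itself the two calls are not fully interchangeable: `mapValues` retains the original RDD's partitioning because the keys cannot change, whereas `map` does not preserve the partitioner unless `preservesPartitioning=True` is passed.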