「 Article 」
May 26, 2022
Word count
11k
Reading time
10 mins.
https://docs.delta.io/1.2.1/delta-utility.html Releases: Delta Lake 2.0.x for Spark 3.2; Delta Lake 2.1/2.x for Spark 3.3. Entry point (SQL): spark-sql --jars delta-core_2.12-2.0.2.jar,delta-storage-2.0.2.jar --conf "spark.sql.extensions=io.d...
Read article
「 CHEATSHEET 」
March 30, 2022
Word count
10k
Reading time
9 mins.
https://zeppelin.apache.org/download.html cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml; vi conf/zeppelin-site.xml (edit zeppelin-site.xml to set the bind IP and port). bin/zeppelin-daemon.sh start; bin/zeppelin-daemon.sh stop. Supported components: a...
Read article
「 Article 」
February 24, 2022
Word count
101k
Reading time
1:32
Read file lines into variables: ans_lines=`sed -n 1p ${verf_file}`; ans_md5=`sed -n 2p ${verf_file}` if-else (numeric equality): if [ ${ans_lines} -eq ${lines} ]; then echo "lines is ok"; else echo "lines is not o...
Read article
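The check in the excerpt above (expected line count on line 1 of a verification file, expected MD5 on line 2) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `verify_file`, the file names, and the MD5 comparison itself are hypothetical, since the excerpt is truncated before it shows how the MD5 value is used.

```python
import hashlib
import tempfile
from pathlib import Path


def verify_file(data_path: str, verf_path: str) -> bool:
    """Check a data file against the line count and MD5 stored in verf_path."""
    # Assumed layout, mirroring `sed -n 1p` and `sed -n 2p` in the excerpt:
    # line 1 holds the expected line count, line 2 the expected MD5 hex digest.
    expected = Path(verf_path).read_text().splitlines()
    ans_lines = int(expected[0].strip())
    ans_md5 = expected[1].strip()

    data = Path(data_path).read_bytes()
    lines = data.count(b"\n")  # counts newline bytes, as `wc -l` does
    md5 = hashlib.md5(data).hexdigest()

    return lines == ans_lines and md5 == ans_md5


# Usage with throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "data.txt"
    verf_file = Path(tmp) / "data.verf"
    payload = b"a\nb\nc\n"
    data_file.write_bytes(payload)
    good_md5 = hashlib.md5(payload).hexdigest()

    verf_file.write_text(f"3\n{good_md5}\n")
    ok = verify_file(str(data_file), str(verf_file))

    verf_file.write_text(f"4\n{good_md5}\n")  # wrong line count
    bad = verify_file(str(data_file), str(verf_file))

print(ok, bad)
```

Reading the whole file into memory keeps the sketch short; for very large files a chunked read that feeds `hashlib.md5().update()` would be the safer choice.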
「 Article 」
February 21, 2022
Word count
478
Reading time
1 min.
AutoHotkey hotstring for quick text replacement: ::pop:: SendInput 13800138000 return
Read article
「 BIGDATA 」
January 28, 2022
Word count
2.3k
Reading time
2 mins.
HDFS Short-Circuit Local Reads: with dfs.client.read.shortcircuit, a co-located client reads the data file directly, bypassing the DataNode process. YARN Multi Local Dirs: yarn.nodemanager.local-dirs is used to spread data across multiple local disks (JBOD). Spark Shuffle Service: spark.shuffle.manag...
Read article
「 CHEATSHEET 」
January 25, 2022
Word count
17k
Reading time
16 mins.
Export data as CSV: insert overwrite directory '/tmp/blog' row format delimited fields terminated by ',' STORED AS TEXTFILE select * from tbl where concat(year,month,day) = '20210721' and phoneNumber='xx' and platform=...
Read article
「 CHEATSHEET 」
January 20, 2022
Word count
19k
Reading time
17 mins.
https://tensorflow.google.cn/resources/learn-ml/theoretical-and-advanced-machine-learning?hl=zh-cn https://tensorflow.google.cn/tutorials/keras/classification?hl=zh-cn http://c.biancheng.net/view/1881.html https://mirrors.tuna.tsinghua.edu.cn/help/an...
Read article
「 CHEATSHEET 」
January 20, 2022
Word count
52k
Reading time
47 mins.
pip: curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py; python get-pip.py. Alternatively, run easy_install pip from C:\python27\Scripts. Changing the package index: on Linux, ~/.pip/pip.conf: [global] index-url = https://pypi.tuna.tsinghua.edu.cn/simple; on Windows, C:\Users\xx\pip\pip.ini: [globa...
Read article
「 CHEATSHEET 」
January 20, 2022
Word count
35k
Reading time
31 mins.
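Version notes: PyQt6 and PySide6 are both Python libraries for calling the Qt 6 API, and either one makes it easy to build Qt-based GUI programs in Python. The biggest difference between them is licensing. PyQt6 is developed by Riverbank Computing and appeared earlier; it is released under the GPLv3 and a commercial license, which means that if you use PyQt6 you must open-source your code, and if you want to keep it closed you need to buy a commercial license. PySide6 is Qt's own official library and appeared much later than PyQt, which is also why many people know PyQt but not PySi...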
Read article
「 CHEATSHEET 」
January 20, 2022
Word count
12k
Reading time
11 mins.
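Architecture. FE (Frontend): 1-5 nodes (split into Follower and Observer roles); stores metadata, including logs and images, typically from a few hundred MB to several GB. BE (Backend): 10-100 nodes; stores user data with 3 replicas. Broker: a process for accessing external data sources (such as HDFS); one broker instance per machine is usually enough. A single machine can host multiple BE instances but only one FE. The clocks on the servers hosting the FEs must be kept in sync (a skew of at most 5 seconds is allowed). Setup: cat /pr...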
Read article
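The 5-second clock-skew limit for FE servers mentioned in the entry above amounts to a simple range check. The sketch below is only an illustration of that arithmetic: the function names, host names, and sample readings are hypothetical, and it deliberately leaves out how the clock readings are collected from each server.

```python
from typing import Mapping

# Limit quoted in the entry above: FE clocks may differ by at most 5 seconds.
MAX_FE_CLOCK_SKEW_SECONDS = 5.0


def fe_clock_skew(fe_clocks: Mapping[str, float]) -> float:
    """Return the largest difference between any two FE clock readings, in seconds."""
    readings = list(fe_clocks.values())
    if not readings:
        return 0.0
    # The largest pairwise difference is always max minus min.
    return max(readings) - min(readings)


def clocks_in_sync(
    fe_clocks: Mapping[str, float],
    limit: float = MAX_FE_CLOCK_SKEW_SECONDS,
) -> bool:
    """True when every FE clock is within `limit` seconds of every other one."""
    return fe_clock_skew(fe_clocks) <= limit


# Hypothetical readings (seconds since the epoch), one per FE host.
sample = {"fe1": 1000.0, "fe2": 1003.5, "fe3": 998.2}
skew = fe_clock_skew(sample)      # 1003.5 - 998.2, about 5.3 seconds
in_sync = clocks_in_sync(sample)  # False: the skew exceeds the 5-second limit
print(f"skew={skew:.1f}s in_sync={in_sync}")
```

In practice, keeping the hosts on a shared time source such as NTP is the usual way to stay inside the limit; the check here only reports whether a given set of readings already does.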