Article May 26, 2022

Delta Lake

Words count 11k Reading time 10 mins.

https://docs.delta.io/1.2.1/delta-utility.html

Release
Delta Lake 2.0.x - for Spark 3.2
Delta Lake 2.1/2.x - for Spark 3.3

Entry point
sql
spark-sql \
  --jars delta-core_2.12-2.0.2.jar,delta-storage-2.0.2.jar \
  --conf "spark.sql.extensions=io.d...

CHEATSHEET March 30, 2022

zeppelin

Words count 10k Reading time 9 mins.

https://zeppelin.apache.org/download.html

cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml
vi conf/zeppelin-site.xml

Edit zeppelin-site.xml to set the IP address and port to bind to.

bin/zeppelin-daemon.sh start
bin/zeppelin-daemon.sh stop

Supported components
a...

Article February 24, 2022

Shell

Words count 101k Reading time 1:32

Read file lines into variables
ans_lines=`sed -n 1p ${verf_file}`
ans_md5=`sed -n 2p ${verf_file}`

if-else
# value equal
if [ ${ans_lines} -eq ${lines} ]
then
    echo "lines is ok"
else
    echo "lines is not o...
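The two snippets above can be combined into one check. The following is a minimal sketch, not the article's own script: the function name `check_file`, the two-line layout of the verification file (line count on line 1, MD5 on line 2), and the use of `md5sum` from GNU coreutils are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): check_file DATA_FILE VERF_FILE
# Assumes VERF_FILE holds the expected line count on line 1
# and the expected MD5 digest on line 2.
check_file() {
    data_file=$1
    verf_file=$2

    # Read the expected values, one per line, as in the sed calls above.
    ans_lines=$(sed -n 1p "${verf_file}")
    ans_md5=$(sed -n 2p "${verf_file}")

    # Compute the actual values (md5sum is assumed to be available).
    lines=$(wc -l < "${data_file}" | tr -d ' ')
    md5=$(md5sum "${data_file}" | awk '{print $1}')

    # Numeric comparison for the line count.
    if [ "${ans_lines}" -eq "${lines}" ]; then
        echo "lines is ok"
    else
        echo "lines is not ok"
        return 1
    fi

    # String comparison for the digest.
    if [ "${ans_md5}" = "${md5}" ]; then
        echo "md5 is ok"
    else
        echo "md5 is not ok"
        return 1
    fi
}
```

Quoting the variables keeps the tests from breaking when a value is empty, and `$(...)` nests more safely than backticks.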

Article February 21, 2022

Daily

Words count 478 Reading time 1 mins.

AutoHotkey
Hotstring text replacement
::pop::
SendInput 13800138000
return

BIGDATA January 28, 2022

OptTechs

Words count 2.3k Reading time 2 mins.

HDFS Short-Circuit Local Reads
dfs.client.read.shortcircuit
A client co-located with the data reads the file directly, bypassing the DataNode process.

YARN Multi Local Dirs
yarn.nodemanager.local-dirs
Used to spread local directories across multiple local disks (JBOD).

Spark Shuffle Service
spark.shuffle.manag...

CHEATSHEET January 25, 2022

hive

Words count 17k Reading time 16 mins.

Export data as CSV
insert overwrite directory '/tmp/blog' row format delimited fields terminated by ',' STORED AS TEXTFILE select * from tbl where concat(year,month,day) = '20210721' and phoneNumber='xx' and platform=...

CHEATSHEET January 20, 2022

TF-ML

Words count 19k Reading time 17 mins.

https://tensorflow.google.cn/resources/learn-ml/theoretical-and-advanced-machine-learning?hl=zh-cn
https://tensorflow.google.cn/tutorials/keras/classification?hl=zh-cn
http://c.biancheng.net/view/1881.html
https://mirrors.tuna.tsinghua.edu.cn/help/an...

CHEATSHEET January 20, 2022

python

Words count 52k Reading time 47 mins.

pip
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py

Run under C:\python27\Scripts:
easy_install pip

Change the package index
linux ~/.pip/pip.conf
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple

windows C:\Users\xx\pip\pip.ini
[globa...

CHEATSHEET January 20, 2022

qt

Words count 35k Reading time 31 mins.

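Version notes
PyQt6 and PySide6 are both Python libraries for calling the Qt 6 API; either one makes it easy to build Qt-based GUI programs in Python. The biggest difference between them is licensing.

PyQt6 is developed by Riverbank Computing and appeared earlier. It is released under the GPLv3 and a commercial license, which means that an application you distribute on top of PyQt6 must be open-sourced under the GPL; to keep it closed-source you need to buy a commercial license.

PySide6 is the Qt project's own official library and appeared much later than PyQt, which is also why many people know PyQt but not PySi...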

CHEATSHEET January 20, 2022

doris

Words count 12k Reading time 11 mins.

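Architecture
FE (Frontend): 1-5 nodes (Followers and Observers). Stores metadata, including logs and images, typically ranging from a few hundred MB to several GB.
BE (Backend): 10-100 nodes. Stores user data with 3 replicas.
Broker: a process for accessing external data sources (such as HDFS). One broker instance per machine is usually enough.
A machine can run multiple BE instances but only one FE. Servers hosting FEs must keep their clocks in sync (a clock skew of at most 5 seconds is allowed).

Setup
cat /pr...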