Hahally's BLOG

- Just want to be an inconsequential adverb -


CopyText

  auth : hahally

  start : 2020.1.11

Running a script in the background

nohup python my.py >> my.log 2>&1 &   # keeps running after logout; stdout and stderr are appended to my.log
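To check on or stop the job later, a minimal sketch (the script and log names match the line above):

tail -f my.log                # follow the log as it grows
ps aux | grep "[m]y.py"       # find the PID of the running script
kill <PID>                    # stop it, substituting the PID printed by ps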

Colab keep-alive script (run in the browser's devtools console to keep the session from idling out)

function ClickConnect(){
console.log("Clicked on connect button");
document.querySelector("colab-connect-button").click()
}
setInterval(ClickConnect,60000)

Django

Common commands:

django-admin startproject locallibrary   # create a project
python manage.py startapp catalog        # create an app
python manage.py runserver               # start the development server
python manage.py makemigrations          # generate database migrations
python manage.py migrate                 # apply migrations
python manage.py createsuperuser         # create an admin account
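A few more management commands that are standard Django, though not in the original notes:

python manage.py shell           # interactive shell with the project environment loaded
python manage.py test            # run the test suite
python manage.py collectstatic   # collect static files for deployment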

views.py

    import markdown

    # render the post body from Markdown to HTML
    posts.content = markdown.markdown(
        posts.content,
        extensions=[
            # 'extra' bundles common extensions such as tables and abbreviations
            'markdown.extensions.extra',
            # syntax highlighting for code blocks
            'markdown.extensions.codehilite',
        ]
    )

Common Scrapy commands

          scrapy startproject proj     # create a project
          scrapy crawl spider_name     # run a spider
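Two more standard commands worth keeping alongside these (the spider and domain names are placeholders):

          scrapy genspider spider_name example.com   # generate a spider skeleton inside the project
          scrapy crawl spider_name -o items.json     # run a spider and export the scraped items to JSON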

PyPI mirrors for installing Python packages

Tsinghua University mirror
https://pypi.tuna.tsinghua.edu.cn/simple/
Aliyun mirror
http://mirrors.aliyun.com/pypi/simple/
USTC mirror
https://pypi.mirrors.ustc.edu.cn/simple/
Douban mirror
http://pypi.douban.com/simple/
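Usage sketch: pass a mirror for a single install, or make it the default (pip config needs pip >= 10; the package name is just an example):

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ requests
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/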

Notes

      Auth       : hahally
createTime       : 2019.10.26
  abstract       : study notes for a big data minor

Configuring JDK environment variables

    Append the following at the bottom of /etc/profile or ~/.bashrc:
    JAVA_HOME=/usr/java/jdk1.8.0_162
    JRE_HOME=$JAVA_HOME/jre
    CLASS_PATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
    PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
    export JAVA_HOME JRE_HOME CLASS_PATH PATH
    [root@master~]# source /etc/profile   apply the configuration
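A quick check that the variables took effect (assumes the paths above):

    [root@master~]# echo $JAVA_HOME   should print /usr/java/jdk1.8.0_162
    [root@master~]# java -version     should report version 1.8.0_162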

Windows DOS commands

    C:\Users\ACER>netstat -aon|findstr "8081"      find which PID is listening on the port
    C:\Users\ACER>taskkill /f /t /pid 10144        force-kill that PID and its child processes

Linux commands

[root@master~]# tar -zxvf [*].tar.gz -C [path]   extract an archive into a target directory (see the example after this list)
[root@master~]# yum -y remove firewalld   uninstall the firewall
[root@master~]# systemctl stop/status/start firewalld   stop / check the status of / start the firewall service
[root@master~]# netstat -tunlp|grep [port]   check what is occupying a port
[root@master~]# sudo passwd root   set the root password
[root@master~]# sudo ln -s /usr/local/jdk1.8.0_162/bin/ bin   create a symbolic link
[root@master~]# cp [-r] file/filedir filepath   copy a file or directory
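As a concrete instance of the tar line above (the archive name and target path are illustrative):

[root@master~]# tar -zxvf hadoop-2.7.5.tar.gz -C /usr/local/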

Ubuntu: recover a lost ens33 connection
[root@master~]# sudo service network-manager stop
[root@master~]# sudo rm /var/lib/NetworkManager/NetworkManager.state
[root@master~]# sudo service network-manager start
[root@master~]# sudo gedit /etc/NetworkManager/NetworkManager.conf # (change managed=false to managed=true)
[root@master~]# sudo service network-manager restart

CentOS: recover a lost ens33 connection
[root@master~]# systemctl stop NetworkManager
[root@master~]# systemctl disable NetworkManager
[root@master~]# sudo ifup ens33   bring ens33 back up
[root@master~]# systemctl restart network
[root@master~]# systemctl start NetworkManager

[root@master~]# sudo ps -e |grep ssh   check whether the ssh service is running

git commands

git init   initialize a repository
git add filename   stage a file for commit
git commit [-m] [message]   commit the staged changes
git remote add origin https://github.com/[username]/[repo].git   add a remote
git push -u origin master -f   force-push to the remote branch (overwrites remote history)
git clone https://github.com/[username]/[repo].git   clone a repository
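A few everyday commands that complement the list above (standard git, not in the original notes):

git status               # show staged and unstaged changes
git pull origin master   # fetch and merge the remote branch
git log --oneline        # compact commit history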

Docker commands

[root@master~]# sudo docker run -it -v /home/hahally/myimage:/data --name slave2 -h slave2 new_image:newhadoop /bin/bash   run a container with a shared directory
[root@master~]# sudo docker start slave2   start a container
[root@master~]# sudo docker exec -it slave2 /bin/bash   open a shell inside a running container
[root@master~]# docker commit master new_image:tag   commit a container to a new image
[root@master~]# sudo docker rm container_name   remove a container
[root@master~]# sudo docker rmi image_name   remove an image
[root@master~]# sudo docker rename name1 name2   rename a container
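To verify the results of the commands above:

[root@master~]# sudo docker ps -a    list all containers, running or stopped
[root@master~]# sudo docker images   list local images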

Hadoop commands

[root@master~]# hadoop dfsadmin -report   report HDFS capacity and usage
[root@master~]# hadoop jar hadoop-mapreduce-examples-2.7.5.jar wordcount /wordcount/input /wordcount/output   run a jar
[root@master~]# hadoop dfsadmin -safemode leave   leave safe mode
[root@master~]# hadoop jar x.jar MainClassName [inputPath] [outputPath]   run a jar, specifying the main class

Running the MapReduce examples bundled with Hadoop

[root@master hadoop-2.7.5]# hadoop fs -mkdir -p /wordcount/input   [create the input directory]
[root@master hadoop-2.7.5]# hadoop fs -put a.txt b.txt /wordcount/input   [upload files into the input directory]
[root@master hadoop-2.7.5]# cd share/hadoop/mapreduce/   [go to the directory containing the example jar]
[root@master mapreduce]# hadoop jar hadoop-mapreduce-examples-2.7.5.jar wordcount /wordcount/input /wordcount/output   [run the jar]
[root@master mapreduce]# hadoop fs -cat /wordcount/output/part-r-00000   [view the output]

Spark

Environment variables

vim ~/.bashrc
# the environment variable configuration now looks like this
export HADOOP_HOME=/usr/local/hadoop
export SPARK_HOME=/usr/local/spark
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export PYSPARK_PYTHON=python3
export JAVA_HOME=/usr/local/java/jdk1.8.0_171
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin:/usr/local/spark/bin:/usr/local/spark/sbin
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH

# apply the configuration
source ~/.bashrc
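A quick sanity check that the configuration took effect (assumes the paths above):

echo $SPARK_HOME              # should print /usr/local/spark
spark-submit --version        # prints the Spark version banner
python3 -c "import pyspark"   # succeeds only if PYTHONPATH picked up the Spark python directories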
spark-env.sh
cd /usr/local/spark
cp ./conf/spark-env.sh.template ./conf/spark-env.sh
vim ./conf/spark-env.sh
# append the following at the end of the file:

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop
Running
/usr/local/spark/bin/run-example SparkPi   # run a bundled example
/usr/local/spark/bin/run-example SparkPi 2>&1 | grep "Pi is roughly"
/usr/local/spark/bin/spark-submit ../examples/src/main/python/pi.py 2>&1 | grep 'Pi'
/usr/local/spark/bin/spark-submit --master yarn --deploy-mode cluster /usr/local/spark/examples/src/main/python/wordcount.py hdfs://master:9000/words.txt

Caveats

When re-running a jar, delete the output directory (e.g. /wordcount/output) first; the job fails if the output directory already exists, so no results can be viewed.
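For the wordcount example above, that means:

hadoop fs -rm -r /wordcount/output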

HBase environment variables

export JAVA_HOME=/usr/local/jdk1.8.0_162
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/usr/local/hadoop-2.7.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:/usr/local/hbase-1.3.6/bin
export HBASE_HOME=/usr/local/hbase-1.3.6
export HBASE_CLASSPATH=/usr/local/hbase-1.3.6/lib/hbase-common-1.3.6.jar:/usr/local/hbase-1.3.6/lib/hbase-server-1.3.6.jar
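After sourcing the file, a minimal check (assumes HBase 1.3.6 installed at the path above):

hbase version    # prints the HBase version if PATH is set correctly
start-hbase.sh   # start HBase
hbase shell      # open the HBase shell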