chaokunyang / bigdata-examples

Big data examples for Spark and Flink

Topics: bigdata, flink, hadoop, monitor, python, samples, spark, spark-sql, sparkml
Latest commit on master: 9984a71 "add pyspark notebook guide" by chaokunyang, 7 years ago
bin/
common/
data_services/
docs/
flink_app/
pyspark_app/
spark_app/
.gitignore
checkstyle.xml
LICENSE
pom.xml
README.md
suppressions.xml

Awesome Bigdata Samples

A curated list of awesome big data applications, together with tools for deployment, operations, and monitoring.

Environment

  • Java: 1.8
  • Scala: 2.11
  • Python: 2.7
  • ZooKeeper: 3.4.6
  • HBase: 1.0.3
  • Kafka: 0.10.0.1
  • Redis: 3.2.6
  • Hadoop: 2.6.5
  • Spark: 2.2.1
  • Flink: 1.4.0

Applications

  • Spark Application
  • Flink Application

Deploying

Operating a server cluster is not easy, and a few scripts can ease the work significantly. Here are some simple tools for this:
  • sync.sh: recursively synchronizes the files in the current directory or a specified directory (including subdirectories) to the same directory on every server listed in the hosts file.
  • del.sh: deletes the current directory or a specified directory on every server listed in the hosts file.
  • dist_run.sh: runs a command on every server listed in the hosts file.
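
The scripts themselves are not reproduced here, so the following is only a minimal Python 3 sketch of the idea behind sync.sh and dist_run.sh: read a hosts file and build one command per server. The hosts file format (one host per line, `#` comments), the function names, and the choice of `ssh` and `rsync -az` are assumptions made for illustration; the real scripts may work differently. The sketch prints the commands rather than executing them.

```python
import shlex


def read_hosts(text):
    """Return host names from hosts-file text, skipping blanks and # comments.

    Assumed format: one host per line (hypothetical, not taken from the repo).
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def dist_run_commands(hosts, cmd):
    """Build one ssh argv per host that would run `cmd` remotely."""
    return [["ssh", host, cmd] for host in hosts]


def sync_commands(hosts, directory):
    """Build one rsync argv per host that would mirror `directory` to the same path."""
    src = directory.rstrip("/") + "/"
    return [["rsync", "-az", src, "{}:{}".format(host, src)] for host in hosts]


# Example hosts and paths are placeholders.
hosts = read_hosts("node1\nnode2  # worker\n\n")
for argv in dist_run_commands(hosts, "jps") + sync_commands(hosts, "/opt/app"):
    # Print instead of executing; pass argv to subprocess.check_call to run it.
    print(" ".join(shlex.quote(a) for a in argv))
```

Building argument lists first and executing them in a separate step makes it easy to dry-run a cluster-wide command before it touches any server.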

Operations

The scripts in awesome-bigdata-samples/bin provide small operations tools for managing small and medium-sized server clusters. The details are as follows:
  • zk_admin.sh: start or stop the ZooKeeper cluster.
    • start the ZooKeeper cluster: `./zk_admin.sh start`
    • stop the ZooKeeper cluster: `./zk_admin.sh stop`
  • kafka_admin.sh: start or stop the Kafka cluster.
    • start the Kafka cluster: `./kafka_admin.sh start`
    • stop the Kafka cluster: `./kafka_admin.sh stop`
  • rerun.py: rerun a task over a date range, for example `python rerun.py -start 2017/11/21 -end 2017/12/01 -task dayJob.sh`

To build the project with Maven: `mvn clean package -DskipTests -Pbuild-jar`
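
The source of rerun.py is not shown here, so the following is a hedged Python 3 sketch of what a date-range rerun with the `-start`, `-end`, and `-task` options could look like. It assumes that the end date is inclusive and that the task script receives each day as its first argument; both are assumptions, as are the function names. The sketch only builds and prints the commands.

```python
import argparse
from datetime import datetime, timedelta

DATE_FORMAT = "%Y/%m/%d"  # matches the 2017/11/21 style used on the command line


def parse_args(argv):
    """Parse the -start, -end and -task options from an explicit argv list."""
    parser = argparse.ArgumentParser(description="Rerun a task for each day in a range")
    parser.add_argument("-start", required=True, help="first day, e.g. 2017/11/21")
    parser.add_argument("-end", required=True, help="last day (assumed inclusive)")
    parser.add_argument("-task", required=True, help="script to run once per day")
    return parser.parse_args(argv)


def days_between(start, end):
    """Return every date from start to end, both inclusive (an assumption)."""
    first = datetime.strptime(start, DATE_FORMAT).date()
    last = datetime.strptime(end, DATE_FORMAT).date()
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


def rerun_commands(args):
    """Build one argv per day; passing the day as the first argument is hypothetical."""
    return [["sh", args.task, day.strftime(DATE_FORMAT)]
            for day in days_between(args.start, args.end)]


args = parse_args(["-start", "2017/11/21", "-end", "2017/12/01", "-task", "dayJob.sh"])
for argv in rerun_commands(args):
    # Print instead of executing; pass argv to subprocess.check_call to run it.
    print(" ".join(argv))
```

Running the days in chronological order, one at a time, keeps a failed day easy to spot and to retry on its own.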

Contribute

  • Source Code: https://github.com/chaokunyang/awesome-bigdata-samples
  • Issue Tracker: https://github.com/chaokunyang/awesome-bigdata-samples/issues

LICENSE

This project is licensed under the Apache License 2.0.