Ralph – Page 17 – Developer Blog

15Jul

Apache Zeppelin | Getting Started

by Ralph Apache, Apache Zeppelin

First Steps with Zeppelin

Zeppelin and MySQL

Create a new Interpreter

Create a new interpreter or configure the existing mysql interpreter.

Configure MySQL Interpreter

Under artifact, add the absolute path of mysql-connector-java-8.0.19.jar.

Add or modify the properties:

default.user

default.password

Prepare MySQL Database

Create a database user spark with password spark

Create a database spark and grant all permissions on it to user spark

Add demo values
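
The three steps above could be scripted roughly as follows. This is only a sketch under assumptions: the file name setup_spark.sql, the host 'localhost' and the columns of the demo table are made up for illustration (the post only names the user spark, the database spark and, in the test query, the table spark.demo). If Zeppelin connects from another machine or from a container, the host part of the user would need to be adjusted.

```shell
# Write the setup statements to a file so they can be reviewed before running.
# Assumptions: the user is limited to localhost; the demo table has two columns.
cat > setup_spark.sql <<'SQL'
CREATE USER IF NOT EXISTS 'spark'@'localhost' IDENTIFIED BY 'spark';
CREATE DATABASE IF NOT EXISTS spark;
GRANT ALL PRIVILEGES ON spark.* TO 'spark'@'localhost';

-- Hypothetical demo table; adjust the columns to your data.
CREATE TABLE IF NOT EXISTS spark.demo (
    id   INT PRIMARY KEY,
    name VARCHAR(50)
);
INSERT IGNORE INTO spark.demo (id, name) VALUES (1, 'first'), (2, 'second');
SQL

# Then run it as a MySQL admin user, for example:
#   mysql -u root -p < setup_spark.sql
```

Writing the statements to a file first keeps the password prompt of the mysql client separate from the SQL and makes the setup repeatable.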

Test MySQL Connection

Create a new notebook with the mysql interpreter

Write sample code

select * from spark.demo;

Installation

Install with Docker

docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.8.1

To persist notebooks and logs, set Docker volume options, for example:

docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.8.1

Install in a vagrant box

Setup base Vagrant Box

vagrant init ubuntu/trusty64
vagrant up
vagrant ssh

Update Operating System

sudo apt-get update -y
sudo apt-get upgrade -y

Install the Vagrant Key

Vagrant commands can only communicate over SSH from the host machine to the guest server if the guest has this “insecure vagrant key” installed. It is called “insecure” because the key pair is public: everyone has the same key, so anyone who can reach a box that uses it can log in.

mkdir -p /home/vagrant/.ssh
chmod 0700 /home/vagrant/.ssh
wget --no-check-certificate \
    https://raw.github.com/mitchellh/vagrant/master/keys/vagrant.pub \
    -O /home/vagrant/.ssh/authorized_keys
chmod 0600 /home/vagrant/.ssh/authorized_keys
chown -R vagrant /home/vagrant/.ssh

Install Zeppelin and required Software

Detailed description can be found here.

sudo apt-get install -y gcc build-essential linux-headers-server
sudo apt-get install git
sudo apt-get install openjdk-7-jdk
sudo apt-get install npm
sudo apt-get install libfontconfig
sudo apt-get install r-base-dev
sudo apt-get install r-cran-evaluate

git clone https://github.com/apache/zeppelin.git
sudo apt-get -y install maven
cd zeppelin
mvn clean package -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pr -Pscala-2.11

Configure Zeppelin

28Jun

Apache Spark | Getting started

by Ralph Apache, Apache Spark

Apache Spark is a fast, general-purpose cluster computing framework. It extends the MapReduce model to efficiently support more types of computation, including interactive queries and stream processing, and it can run on top of an existing Hadoop cluster.

This is an extract from this brief tutorial that explains the basics of Spark Core programming.

Environment / Requirements

Installation on Mac OS X

Check or install java

$ java -version
java version "12.0.1" 2019-04-16
Java(TM) SE Runtime Environment (build 12.0.1+12)
Java HotSpot(TM) 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)

Check or install Scala

$ brew install scala

$ scala -version
Scala code runner version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.

Check or install Apache Spark

Set up the environment in .bashrc

export PATH="$PATH:$SPARK_HOME/bin"
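
The PATH line above relies on SPARK_HOME, which the post does not show being set. A minimal sketch of the .bashrc lines, assuming Spark was unpacked to $HOME/spark (a hypothetical location; substitute your own path):

```shell
# Assumed install location; change this to wherever Spark was unpacked.
export SPARK_HOME="$HOME/spark"

# Append Spark's bin directory to PATH only if it is not already there,
# so re-sourcing .bashrc does not add duplicate entries.
case ":$PATH:" in
    *":$SPARK_HOME/bin:"*) ;;
    *) export PATH="$PATH:$SPARK_HOME/bin" ;;
esac
```

After opening a new shell (or running source ~/.bashrc), spark-shell should be found, provided that directory really holds a Spark distribution.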

Installation on Ubuntu

Prepare Ubuntu

apt update
apt upgrade

apt-get install openjdk-8-jdk
java -version

Links and Resources

Apache Spark in Python: Beginner’s Guide

24Jun

Jekyll | Build a Jekyll Template based on Bootstrap 4

by Ralph Bootstrap

TL;DR

Combine two amazing open source tools: Jekyll and Bootstrap. The final template is here.

Bootstrap Template and Jekyll: two powerful tools

Start Point

Because I wanted to learn about and work with Bootstrap, I decided to build a Jekyll template, so that I can build a dynamic website.

Asking Google for some first inspiration led me to this wonderful blog entry:

Choose a Bootstrap Template

Quite nice. So I decided to use one of the free templates from Start Bootstrap: Modern Business

When I downloaded the template from GitHub and examined its content, I found that for each component (Pricing, Services, Contact) there is a corresponding HTML file containing all the content and all the formatting code:

about.html
blog-home-1.html
blog-home-2.html
blog-post.html
contact.html
faq.html
full-width.html
index.html
portfolio-1-col.html
portfolio-2-col.html
portfolio-3-col.html
portfolio-4-col.html
portfolio-item.html
pricing.html
services.html
sidebar.html

The Plan

My plan was to separate the presentation layer (what you will see) from the business layer (what creates the content for the presentation layer).

To achieve this with Jekyll, I converted the Bootstrap pages to Jekyll include pages. The final result should look like this:

The front page for the component:

---
layout: page
title: Services
---
<div class="container">
    <h1 class="mt-4 mb-3">{{ page.title }}</h1>
    {
</div>

The Jekyll include file with the component:

{

<h2>Services: {{ site.services.title }}</h2>

<!-- Image Header -->
<img class="img-fluid rounded mb-4" src="{{ images }}/header.jpg" alt="">

<div class="row">
    {% for item in site.services %}
        <div class="col-lg-4 mb-4">
            <div class="card h-100">
                <h4 class="card-header">{{ item.title }}</h4>
                <div class="card-body">
                    <p class="card-text">{{ item.text | markdownify }}</p>
                </div>
                <div class="card-footer">
                    <a href="#" class="btn btn-primary">Learn More</a>
                </div>
            </div>
        </div>
    {% endfor %}
</div>

The next step was to convert every Bootstrap template page to a Jekyll include file:

About Page

FAQ Page

Portfolio Page with 1 Column

Portfolio Page with 2 Column

Services Page

Pricing Page

The main challenge in separating the presentation from the business layer was: where to place the data to be displayed?

Depending on the type of component, I chose one of three solutions:

Place the data in the corresponding include file of the component
Place the data in the page that calls the corresponding include file of the component
Place the data in a Jekyll collection file

Data in corresponding include file of the component

I used this approach for components that are used only once on the website and have mostly static content, e.g. the FAQ Page.

The component page

The frontend page

Data in the page, which calls the corresponding include file of the component

I used this approach for components that are used more than once on the website, e.g. a Blog Post.

The component page

The frontend page

Data in a Jekyll collection file

I used this approach for components that are used only once on the website but need more configuration information, e.g. the Services or Portfolio Page.

This step needs an additional configuration task: create the Jekyll Collections.

Jekyll collections are a great way to group related content like members of a team or talks at a conference.

To use a Collection you first need to define it in your _config.yml.

#
collections_dir: collections # folder, where collections files are stored
collections:
  services:
    title: "Services"
    output: true # store output files for each item under the collections folder

Then create the collection files, one file for each item in your collection:

These files look like this:

---
img: 1.jpg
title: Development
subtitle: 
footer: 
text: Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus aut mollitia eum ipsum fugiat odio officiis odit.
---
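
With collections_dir: collections and a collection named services, as configured above, Jekyll looks for the item files in collections/_services/. The following sketch scaffolds such files. The helper name new_service, the file names and the second item (Design) are made up for illustration, and the text is placeholder copy.

```shell
# Jekyll expects each collection in a folder named _<collection> inside collections_dir.
mkdir -p collections/_services

# Hypothetical helper: write one collection item with the front matter fields used above.
new_service() {
    slug="$1"; title="$2"; img="$3"
    cat > "collections/_services/$slug.md" <<EOF
---
img: $img
title: $title
subtitle:
footer:
text: Placeholder text for $title.
---
EOF
}

new_service development "Development" 1.jpg
new_service design "Design" 2.jpg   # illustrative second item
```

Each call produces one file, so adding a further service to the page only means adding one more file to the folder.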

The data in these files can be accessed in the Jekyll include file with this code fragment.

All items of the collection are available as site.services:

{% for item in site.services %}
    <div class="col-lg-4 mb-4">
        <div class="card h-100">
            <h4 class="card-header">{{ item.title }}</h4>
            <div class="card-body">
                <p class="card-text">{{ item.text | markdownify }}</p>
            </div>
            <div class="card-footer">
                <a href="#" class="btn btn-primary">Learn More</a>
            </div>
        </div>
    </div>
{% endfor %}



The final result



Bootstrap Template and Jekyll: two powerful tools

		
		   		
		    
			    
                   						
							
18Jun

Go | Cookbook

by Ralph Cookbook, Go

Tutorial



Building a Simple CLI Tool with Golang



Links



https://golang.org/
https://blog.golang.org/
https://gobyexample.com

		
		   		
		    
			    
                   						
							
18Jun

Jekyll | Cookbook

by Ralph Cookbook, Jekyll

Working with Arrays



Define the array



---
layout: post
title:  "Universe"
date:   2019-06-17 10:00:00
planets:
    - mercury
    - venus
    - earth
---

Access the array

{% for planet in page.planets %}
    <a href="https://{{ planet }}.universe">{{ planet }}</a>
{% endfor %}



Liquid



Links



https://github.com/Shopify/liquid/wiki/Liquid-for-Designers#optional-arguments



Code Snippets



for-loop-sorted-collection



<ul>
    {
    {
    <li>{{ item.title }}</li>
    {
</ul>
{



Code Snippets and Recipes




https://gist.github.com/ryerh/b2fa73829f1b7b1c39988f09a65eb227

		
		   		
		    
			    
                   						
							
15Jun

Learning | Path for Data Scientist

by Ralph Data Science, Learning

Portfolio
Python Pandas / NumPy / SciPy
Apache Spark
Apache Hadoop



Learning



https://www.coursera.org/learn/python-data-analysis/home/welcome
Introduction to Data Science in Python
https://www.coursera.org/learn/python-machine-learning/home/welcome
https://www.coursera.org/learn/progfun1/home/welcome
https://www.coursera.org/learn/hadoop/home/welcome
https://www.coursera.org/learn/machine-learning/home/welcome
https://www.coursera.org/learn/python-text-mining/home/welcome
https://www.coursera.org/learn/scala-spark-big-data/home/welcome
https://www.coursera.org/learn/python-plotting/home/welcome
https://www.coursera.org/learn/datasciencemathskills/home/welcome
https://www.coursera.org/learn/data-analysis-tools/home/welcome
https://www.coursera.org/learn/data-visualization/home/welcome
https://www.coursera.org/learn/big-data-introduction/home/welcome
https://www.coursera.org/learn/big-data-machine-learning/home/welcome



Mathematics for Data Science



Linear Algebra



Khan Academy Linear Algebra series (beginner friendly).
Coding the Matrix course (and book).
3Blue1Brown Linear Algebra series.
fast.ai Linear Algebra for coders course, highly related to modern ML workflow.
First course in Coursera Mathematics for Machine Learning specialization.
“Introduction to Applied Linear Algebra — Vectors, Matrices, and Least Squares” book.
MIT Linear Algebra course, highly comprehensive.
Stanford CS229 Linear Algebra review.



Calculus



Khan Academy Calculus series (beginner friendly).
3Blue1Brown Calculus series.
Second course in Coursera Mathematics for Machine Learning specialization.
The Matrix Calculus You Need For Deep Learning paper.
MIT Single Variable Calculus.
MIT Multivariable Calculus.
Stanford CS224n Differential Calculus review.



Statistics and Probability



Khan Academy Statistics and probability series (beginner friendly).
A visual introduction to probability and statistics, Seeing Theory.
Intro to Descriptive Statistics from Udacity.
Intro to Inferential Statistics from Udacity.
Statistics with R Specialization from Coursera.
Stanford CS229 Probability Theory review.



Bonus materials



Part one of Deep Learning book.
CMU Math Background for ML course.
The Math of Intelligence playlist by Siraj Raval.

		
		   		
		    
			    
                   						
							
12Jun

Learning Angular

by Ralph Angular, Learning

Starting with Angular



Links



Apps



https://github.com/sevilayha/angular-first-site-test/blob/master/src/app/core/services/user.service.ts 



Tutorials



https://angular.io/tutorial



https://angular-templates.io/tutorials/about/learn-angular-from-scratch-step-by-step



https://www.techiediaries.com/angular-tutorial-basics/ 



https://angular.de/artikel/angular-tutorial-deutsch/



https://tutorialzine.com/2016/09/30-learning-resources-for-mastering-angular-2








Building a Website



How I Built A Simple Website In Angular Using Bootstrap-Jumbotron



How To Build Responsive Layouts With Bootstrap 4 and Angular 6



https://medium.com/@hamedbaatour/build-a-real-world-beautiful-web-app-with-angular-6-a-to-z-ultimate-guide-2018-part-i-e121dd1d55e 
https://www.creativebloq.com/how-to/how-to-build-a-full-page-website-in-angular 
https://www.airpair.com/angularjs/building-angularjs-app-tutorial 



Templates: Website



https://startbootstrap.com/previews/modern-business/ 
https://html5boilerplate.com/ or on GitHub

		
		   		
		    
			    
                   						
							
11Jun

Hadoop | Getting started

by Ralph Hadoop




Modules



HDFS: Hadoop’s file share, which can be local or shared depending on your setup
MapReduce: Hadoop’s aggregation/synchronization tool enabling highly parallel processing…this is the true “engine” or time saver in Hadoop
Hive: Hadoop’s SQL query window, equivalent to Microsoft Query Analyzer
Pig: Dataflow scripting tool similar to a batch job or simplistic ETL processor
Flume: Collector/facilitator of log file information
Ambari: Web-based admin tool used for managing, provisioning, and monitoring a Hadoop cluster
Cassandra: High-availability, scalable, multi-master NoSQL database platform…a database on steroids
Mahout: Machine learning engine, which means it does complex calculations, algorithmic processing, and statistical/stochastic operations using R and other frameworks…it does serious math!
Spark: Programmatic compute engine allowing for ETL, machine learning, stream processing, and graph computation
ZooKeeper: Coordinator service for all your distributed processing
Oozie: Workflow scheduler managing Hadoop jobs



Links



Apache



https://sentry.apache.org/



https://de.hortonworks.com/apache/ranger/  



https://mahout.apache.org/



https://pig.apache.org/



https://zookeeper.apache.org/



https://oozie.apache.org/



Miscellaneous



http://ercoppa.github.io/HadoopInternals/

		
		   		
		    
			    
                   						
							
5Jun

Ionic | Advanced Know-How

by Ralph Ionic, Ionic 3, Ionic 4, Mobile Development

Working on Android



Start emulator



$ emulator -list-avds
6
6_x86_64
7
$ emulator @6

Show logfile messages

$ adb logcat

Run on Device

$ adb uninstall io.ionic.conference
$ ionic run android

Working on iOS

List available devices

$ ios-sim showdevicetypes

Run on Emulator

$ ionic emulate ios --target="iPhone-6, 10.1"

Run on Device

16May

SAS | Cookbook

by Ralph Cookbook, SAS

Handling data

Split fields




Data Cleaning



Filter out by value of an entry



if prxmatch('/^(TST|TEST|ek-test-)/', USERNAME) then
   output &_TSTDSN.;            
else
   output &_OUTDSN.;
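
The pattern can be tried outside SAS before it goes into a data step. The sketch below applies the same regular expression with grep -E to a few made-up user names (the sample names and file names are assumptions, not data from the original) and splits them the way the IF/ELSE above routes records to the test and output data sets.

```shell
# Made-up sample user names, one per line.
printf '%s\n' TST001 TESTUSER ek-test-42 alice bob_TEST > usernames.txt

# Same pattern as in the prxmatch call: names starting with TST, TEST or ek-test-.
pattern='^(TST|TEST|ek-test-)'

grep -E  "$pattern" usernames.txt > test_users.txt   # matches: test accounts
grep -Ev "$pattern" usernames.txt > real_users.txt   # everything else
```

Because the pattern is anchored with ^, a name such as bob_TEST that merely contains TEST is not treated as a test account.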
