Posts

Showing posts from 2018

Dask Cluster

Follow the steps below to set up a Dask cluster.

OS: Ubuntu 16.04
Cloud: AWS

# Update the package index
sudo apt-get update
# Install pip
sudo apt-get install python-pip
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
# Add your user to the docker group
sudo usermod -aG docker $USER
# Install git, then clone https://github.com/dask/dask-docker
git clone https://github.com/dask/dask-docker
# Go to the dask-docker/ directory
cd dask-docker/
# Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# Give the binary executable permission
sudo chmod +x /usr/local/bin/docker-compose
# Test the installation
docker-compose --version
# docker-compose up
# Run the scheduler
docker run -it --network host daskdev/dask
# Run a worker
docker run -it --network host daskdev/dask dask-worker 10.0.0.60:87...
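Before starting workers on other machines, it can help to confirm that the scheduler's port is reachable from the worker host. The following is a minimal sketch using only the Python standard library. The helper name can_connect is hypothetical, and the host and port you pass are whatever your scheduler actually listens on; the demo at the bottom uses a throwaway local listener rather than a real Dask scheduler.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Demo only: listen on a free local port and probe it.
    # Replace host and port with your scheduler's address in real use.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    print("reachable:", can_connect("127.0.0.1", demo_port))
    listener.close()
```

If the check fails on AWS, the security group rules between the scheduler and worker instances are a likely place to look.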

Machine Learning

Machine Learning is a subset of AI (Artificial Intelligence).

Machine Learning Workflow:
1. Understanding the problem and objective
2. Reading from data sources
3. EDA (Exploratory Data Analysis)
4. Data Cleaning
5. Modeling
6. Deployment and Reporting

Use algorithms that figure out the rules for us, instead of you (the developer) writing rules for each input.

INPUT -> Classifier -> OUTPUT

Classifier: we train a classifier. A classifier is a function that takes data as input and assigns labels to it as output, e.g. a decision tree is a type of classifier.

"If a classifier is a box of rules, then the learning algorithm is the procedure that creates them", since it finds patterns in the training data. In scikit-learn, the training algorithm is included as the fit() function.

Supervised Learning: Collect training data -> Train classifier -> Make predictions

Training Data: feature 1, feature 2, .... feature n -> Label
e.g.
weight, Texture -> Label
150g, Smooth -> Apple
130g, Bumpy -> O...
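To make the "box of rules" idea concrete, here is a minimal library-free sketch of a learning algorithm that creates a single rule from training data. It is not scikit-learn's implementation; the class name StumpClassifier is hypothetical, and it only borrows the fit()/predict() naming. Apart from the 150g smooth apple, the training rows and the "lemon" label are invented for illustration, with texture encoded as 1 for smooth and 0 for bumpy.

```python
from collections import Counter


class StumpClassifier:
    """Learns one rule: 'if feature[i] <= threshold then A else B'."""

    def fit(self, rows, labels):
        best = None
        for i in range(len(rows[0])):
            # Try every observed value of feature i as a split point.
            for threshold in sorted({row[i] for row in rows}):
                left = [lab for row, lab in zip(rows, labels) if row[i] <= threshold]
                right = [lab for row, lab in zip(rows, labels) if row[i] > threshold]
                if not left or not right:
                    continue  # a split must separate the data into two groups
                left_label = Counter(left).most_common(1)[0][0]
                right_label = Counter(right).most_common(1)[0][0]
                errors = sum(lab != left_label for lab in left)
                errors += sum(lab != right_label for lab in right)
                if best is None or errors < best[0]:
                    best = (errors, i, threshold, left_label, right_label)
        if best is None:
            raise ValueError("need at least two distinct values in some feature")
        _, self.feature, self.threshold, self.left_label, self.right_label = best
        return self

    def predict(self, rows):
        return [
            self.left_label if row[self.feature] <= self.threshold else self.right_label
            for row in rows
        ]


# Features: (weight in grams, texture) with texture 1 = smooth, 0 = bumpy.
# Illustrative toy data, not from a real dataset.
X = [(150, 1), (170, 1), (130, 0), (140, 0)]
y = ["apple", "apple", "lemon", "lemon"]

clf = StumpClassifier().fit(X, y)
print(clf.predict([(160, 1), (120, 0)]))
```

The developer never writes the weight threshold by hand; fit() finds it from the examples, which is the point of the workflow described above.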

Blockchain with Java

Creating Your First Blockchain with Java https://medium.com/programmers-blockchain/create-simple-blockchain-java-tutorial-from-scratch-6eeed3cb03fa

How to install latest HDP cluster using Ambari

Below are some important points for installing HDP using Ambari. The process was followed on RHEL 7.x on AWS. I have considered 1 Ambari node, 1 master node, and 2 slave nodes.

1. Collect Information

You need the fully qualified domain name (FQDN) of each host in your system. The Ambari Cluster Install wizard supports using IP addresses. To check the FQDN of a host, you can use:

hostname -f

Start the VMs on AWS and jot down the details, as in the example below:

10.0.0.116 ip-10-0-0-116.us-west-2.compute.internal ambari
10.0.0.164 ip-10-0-0-164.us-west-2.compute.internal master1
10.0.0.145 ip-10-0-0-145.us-west-2.compute.internal slave1
10.0.0.123 ip-10-0-0-123.us-west-2.compute.internal slave2

Example (one line per host):

HOSTNAME=ip-10-0-0-116.us-west-2.compute.internal
HOSTNAME=ip-10-0-0-164.us-west-2.compute.internal
HOSTNAME=ip-10-0-0-145.us-west-2.compute.internal
HOSTNAME=ip-10-0-0-123.us-west-2.compute.internal

2. Set Up Password-less SSH

Steps:
1. Generate public and private SSH keys on the Ambari Server host....
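When jotting down the host details, a typo in one IP or FQDN is easy to miss. The sketch below renders the host table from step 1 in the same "IP FQDN alias" layout (which is also the layout /etc/hosts uses) and checks that each FQDN matches its IP. The check assumes the ip-A-B-C-D naming pattern seen in the sample hosts; the function names are hypothetical.

```python
# Host table copied from the example in step 1: (private IP, FQDN, alias).
HOSTS = [
    ("10.0.0.116", "ip-10-0-0-116.us-west-2.compute.internal", "ambari"),
    ("10.0.0.164", "ip-10-0-0-164.us-west-2.compute.internal", "master1"),
    ("10.0.0.145", "ip-10-0-0-145.us-west-2.compute.internal", "slave1"),
    ("10.0.0.123", "ip-10-0-0-123.us-west-2.compute.internal", "slave2"),
]


def expected_fqdn_prefix(ip):
    """Assumed naming pattern from the sample: 10.0.0.116 -> ip-10-0-0-116."""
    return "ip-" + ip.replace(".", "-")


def hosts_file_lines(hosts):
    """Return one 'IP FQDN alias' line per host, rejecting mismatched pairs."""
    lines = []
    for ip, fqdn, alias in hosts:
        if not fqdn.startswith(expected_fqdn_prefix(ip) + "."):
            raise ValueError(f"FQDN {fqdn!r} does not match IP {ip!r}")
        lines.append(f"{ip} {fqdn} {alias}")
    return lines


if __name__ == "__main__":
    print("\n".join(hosts_file_lines(HOSTS)))
```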

How to install java and docker on Linux (Ubuntu)

Installation steps on Ubuntu

AMI: ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20180126 (ami-79873901)

# Install Oracle Java 8
sudo apt-get update
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

# Managing Java
sudo update-alternatives --config java

# Setting the JAVA_HOME environment variable
sudo update-alternatives --config java
sudo vi /etc/environment
# Insert this line:
JAVA_HOME="/usr/lib/jvm/java-8-oracle"
source /etc/environment
echo $JAVA_HOME

Get Docker now

# This script is meant for a quick and easy install:
curl -fsSL get.docker.com -o get-docker.sh
sh get-docker.sh

# Check that Docker is installed properly
docker info
docker version
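As a quick sanity check after the steps above, a short script can report whether the java and docker commands are on PATH and whether JAVA_HOME is set. This is a minimal sketch using the standard library only; the helper name find_tools is hypothetical, and it checks presence only, not versions.

```python
import os
import shutil


def find_tools(names):
    """Map each command name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["java", "docker"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
    # JAVA_HOME only appears here if the current session has loaded it.
    print("JAVA_HOME:", os.environ.get("JAVA_HOME", "not set"))
```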

Big data

The following topics will be covered: big data stack installation, business use case, and development.

1. Installation: The best way to approach this step is to use Ubuntu; I found this OS to be very friendly. You can go ahead with the 14.04 LTS (long-term support) release. Use the Ubuntu site to download it.
Note: you can try Ubuntu before actually installing it. You can also make a bootable USB from Windows and try the OS from the USB first before going for the actual installation.
1.1. Which file system? HDFS
1.2. Which data management layer? YARN

2. Business Use Case: It is very important to understand that the big data (Hadoop) world is for problems that have the 5 V's: volume, variety, velocity, value, and veracity. More technically speaking, if you are not going to have more than 5 datanodes, there is no point in using HDFS, and with fewer than millions or billions of records, think twice before using HBase. Also check the data generation activity, e.g. is the data machine generated or human generated?

3. Development: Ecl...

10 min walk with Odata

OK, I understand now: it's data on top of REST..... :) I really don't understand the pace at which IT is changing. It was just a few months back that we started talking about REST APIs. Some of the geeks stopped talking about SOAP and started the conversation on REST. The key drivers were the arrival of mobile and some governance issues around SOA. Many sessions and seminars started talking about REST principles and architectural style. Giant companies also started aligning themselves around this concept. Many API tools emerged. Then suddenly people started talking about writing the query itself in the URL. What???? You mean writing a select query in the URL itself? The answer is yes, obviously with a little syntax change. So what about writing joins, unions, etc.? They are all possible. The OData specification supports all this. http://www.odata.org/ Don't miss the bus... This is important from today's multi-channel application perspective.
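To show what "a query in the URL" looks like, here is a minimal sketch that builds an OData-style URL from system query options such as $select, $filter, $orderby, and $top. The helper name odata_url, the service root https://example.com/odata, and the Products entity set with Name and Price fields are all hypothetical; a real service defines its own entity sets and may support only some options.

```python
from urllib.parse import quote


def odata_url(service_root, entity_set, **options):
    """Build a query URL; each keyword becomes a $-prefixed query option."""
    query = "&".join(
        "$" + name + "=" + quote(str(value), safe=",'()")
        for name, value in options.items()
    )
    url = service_root.rstrip("/") + "/" + entity_set
    return url + "?" + query if query else url


# Roughly: SELECT Name, Price FROM Products WHERE Price > 20
#          ORDER BY Price DESC LIMIT 5
url = odata_url(
    "https://example.com/odata",  # hypothetical service root
    "Products",                   # hypothetical entity set
    select="Name,Price",
    filter="Price gt 20",
    orderby="Price desc",
    top=5,
)
print(url)
```

Note the "little syntax change" mentioned above: comparison is written as gt rather than >, and spaces are percent-encoded in the final URL.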

Active Directory Pattern on Cloud

coming soon....

What is the difference between VM and Containers

VM stands for virtual machine: the name itself tells you that computer hardware has been implemented as software and provided to you. It has its own operating system and all.

Containers: the Linux kernel has specific features, such as cgroups and namespaces, which can be leveraged to create isolated processes called containers. It means one shared OS kernel with many small processes, or lighter "VMs", running on top of it as containers. The future is moving towards containers. Examples are Docker, LXC, rkt, etc.

Since a project can be assembled from thousands of containers, we need a tool to orchestrate them (e.g. for resource management). One such tool is Kubernetes.