1. INTRO
All-hands Knowledge-sharing Lunch!
• This month's session is about Applied Machine Learning (ML) - a test
personal project I am working on, the reasons thereof, and the technology
underneath.
• The project uses APIs from Cloud vendors to sift through satellite images.
• The goal today is to start a discussion around Emerging Tech at NASA.
Applied ML - Harsh Prakash 1
2. WHAT IS ML?
• ML, a subset of AI and a superset of DL, enables a system to perform
specific tasks, like predicting outcomes and recognizing images, without
explicit instructions. It does so by analyzing and learning from data,
based on patterns and inference, with minimal human intervention.
• NLP, a subset of AI, helps a user read, analyze, interpret and understand
natural language data, and perform speech recognition.
• AI helps a user’s computer systems learn (acquire data and the rules
governing its use), reason (reach conclusions), problem-solve and
self-correct to inform their decisions.
• A neural network is at the heart: it is designed to recognize patterns
(variables that rise and fall together).
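One simple way to put a number on "variables that rise and fall together" is the Pearson correlation coefficient. The sketch below is only an illustration of that idea, with toy values invented for the example; a neural network learns far richer patterns than a single pairwise correlation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy values, invented for illustration only (not from the project).
road_density = [1.0, 2.0, 3.0, 4.0, 5.0]
building_density = [2.0, 4.0, 6.0, 8.0, 10.0]  # rises with road_density
vegetation = [5.0, 4.0, 3.0, 2.0, 1.0]         # falls as road_density rises

r_up = pearson(road_density, building_density)
r_down = pearson(road_density, vegetation)
print(r_up, r_down)
```

A coefficient near +1 means the two variables rise together, and one near -1 means one falls as the other rises.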
3. LIVE DEMO *
• Web app on Apache uses AWS SDK for Rekognition API connected to a video
camera for near real-time image analysis.
• ML assigns LABELS/TAGS, and the Model API returns a raw JSON response.
• MaxLabels, MinConfidence, etc. can be adjusted; the app can be ported to
Lambda/S3 Bucket and can send alerts.
* Service currently available in AWS GovCloud (US-West) only.
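As a rough sketch of the demo's label step: the Rekognition DetectLabels call accepts MaxLabels and MinConfidence and returns JSON with a "Labels" list of Name/Confidence entries. The helper names below (top_labels, detect_labels_live) and the sample values are assumptions made for illustration; this is not the demo's actual code. The live call needs boto3 and AWS credentials, so only the parsing helper runs here.

```python
import json

def top_labels(response, max_labels=10, min_confidence=75.0):
    """Return (name, confidence) pairs from a DetectLabels-style response,
    keeping at most max_labels entries at or above min_confidence."""
    labels = [
        (label["Name"], label["Confidence"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    labels.sort(key=lambda pair: pair[1], reverse=True)
    return labels[:max_labels]

def detect_labels_live(image_bytes, max_labels=10, min_confidence=75.0):
    """Call Rekognition directly. Requires boto3 and AWS credentials,
    so it is defined here but not called in the sample run below."""
    import boto3  # third-party; imported lazily so the sample runs without it
    client = boto3.client("rekognition")
    return client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )

# Made-up sample, shaped like the raw JSON response (values are illustrative).
sample_json = """
{"Labels": [
  {"Name": "Person", "Confidence": 98.7},
  {"Name": "Laptop", "Confidence": 91.2},
  {"Name": "Coffee Cup", "Confidence": 62.4}
]}
"""
sample = json.loads(sample_json)
kept = top_labels(sample, max_labels=2, min_confidence=75.0)
print(kept)
```

Raising min_confidence trims uncertain tags; lowering max_labels keeps only the strongest ones.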
5. COLLEGE PROJECT *
Growth Study for Charlottesville VA, 2000-2030
Annual Scholarship, 2001
Used satellite images and Census data to compute population growth
distribution –
• Divided study area of the county into 5,745 grid cells (250 meters x 250
meters).
• Traditional compute model assigned growth weights based on development
indicators at the neighborhood level.
* https://www.slideshare.net/gisblog/gis-growth-study-for-charlottesville-va-20002030-plan-885-vamlis-2001-38716260
Development Indicator
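The grid arithmetic, and one possible way to spread growth across cells by weight, can be sketched as follows. The cell size and count come from the study; the proportional allocation rule, the weights and the growth total are hypothetical, since the slides do not describe the original model in detail.

```python
CELL_SIDE_M = 250   # from the study: 250 m x 250 m cells
CELL_COUNT = 5745   # from the study

cell_area_km2 = (CELL_SIDE_M / 1000) ** 2      # 0.0625 km^2 per cell
study_area_km2 = CELL_COUNT * cell_area_km2    # total gridded area

def allocate_growth(total_growth, weights):
    """Split total_growth across cells in proportion to each cell's weight.
    Proportional allocation is an assumption, not the study's stated method."""
    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("at least one cell needs a positive weight")
    return [total_growth * w / weight_sum for w in weights]

# Hypothetical weights for four cells and a hypothetical growth total.
weights = [0.0, 1.0, 3.0, 4.0]
shares = allocate_growth(800, weights)
print(study_area_km2, shares)
```

Cells with stronger development indicators receive a larger share, and the shares always sum to the total being distributed.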
6. TEST PROJECT *
• As volunteer Directors, our focus is on mapping poverty hotspots.
• Using Cloud-based ML model with satellite images to detect development
indicators at the neighborhood level.
* https://www.globalmapaid.org/patron-directors/
7. STEPS
1. Opened account with Google Cloud Platform (GCP).
2. Enabled Google Maps API for project.
3. Enabled billing for project to fetch more than 1 satellite image per day
using API key.
4. Tuning model for known test areas. E.g. New York...
• Using satellite images for Ethiopia’s capital, Addis Ababa, from Google
Maps API at their highest available resolution (zoom: 17, or 1x1 sq.
mile).
• Using Cloud-based ML model to classify satellite images by infrastructure
levels.
• Assuming correlation between infrastructure and visual indicators in
satellite images.
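The image-fetch step above can be sketched as building a request URL. This assumes the Maps Static API is the Google Maps API in question, uses approximate coordinates for Addis Ababa, and uses a placeholder key and an assumed 640 x 640 pixel size; none of these specifics are stated in the slides.

```python
from urllib.parse import urlencode

# Assumption: the Maps Static API endpoint is the one used for satellite tiles.
STATIC_MAPS_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"

def satellite_image_url(lat, lng, api_key, zoom=17, size_px=640):
    """Build a Static Maps request URL for one square satellite image."""
    params = {
        "center": f"{lat},{lng}",
        "zoom": zoom,                    # zoom 17, as in the steps above
        "size": f"{size_px}x{size_px}",  # assumed image size in pixels
        "maptype": "satellite",
        "key": api_key,
    }
    return f"{STATIC_MAPS_ENDPOINT}?{urlencode(params)}"

# Approximate coordinates for Addis Ababa; YOUR_API_KEY is a placeholder.
url = satellite_image_url(9.03, 38.74, api_key="YOUR_API_KEY")
print(url)
```

Requesting maptype=satellite rather than hybrid, roadmap or terrain keeps map overlays out of the image that is passed to the ML model.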
8. KNOWN TEST AREAS
Bridge – New York City, NY
ML assigns labels: Nature, Outdoors, Landscape, Scenery
City Center – New York City, NY
ML assigns labels: Outdoors, Nature, Landscape, Scenery, Urban, Building,
Neighborhood, Road, Housing, City, Town, Intersection
Rural Town of Cazenovia, NY
ML assigns labels: Landscape, Outdoors, Nature, Scenery, Aerial View, Land,
Urban, Road, Housing, Building, Yard, Neighborhood
9. FINDINGS FROM KNOWN TEST AREAS
• For the City Center in New York City, NY – ML assigns the label “Urban”
with 94% confidence. For the rural Town of Cazenovia, NY – ML assigns the
label “Urban” with 76% confidence: a gap of 18 percentage points here, and
typically about 15 percentage points, between True Positive (TP) and False
Positive (FP).
• Hybrid, Roadmap and Terrain images add noise.
• Real-world applicability – If it reinforces what people on the ground
already know, it would be really helpful to Global MapAid donors and
volunteers.
Map: City of Addis Ababa (Urban, Rural)
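The confidence gap from the two known test areas can be turned into a simple cutoff. The two confidence values come from the findings above; placing the cutoff at the midpoint is an assumption made for this sketch, not a rule stated in the slides.

```python
# "Urban" label confidence from the known test areas (values from the slides).
URBAN_CONFIDENCE = {
    "City Center, New York City, NY": 94.0,  # known urban area (true positive)
    "Town of Cazenovia, NY": 76.0,           # known rural area (false positive)
}

tp = URBAN_CONFIDENCE["City Center, New York City, NY"]
fp = URBAN_CONFIDENCE["Town of Cazenovia, NY"]

gap = tp - fp              # percentage points between TP and FP
threshold = (tp + fp) / 2  # assumed cutoff: midpoint between the two

def is_urban(confidence, cutoff=threshold):
    """Treat an area as urban only if the label confidence clears the cutoff."""
    return confidence >= cutoff

for area, confidence in URBAN_CONFIDENCE.items():
    print(area, is_urban(confidence))
```

With more known test areas, the cutoff could be tuned on many TP/FP pairs instead of a single one.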
10. TODO
• Use other datasets to augment data for BI applications. E.g. Census, IRS,
web searches, survey data from USAID and World Bank, etc.
• Use the k-nearest neighbors (k-NN) algorithm for pattern recognition to
predict for blind spots, and transform ML labels into vectors.
• Use Cloud-based ML to identify patterns early and predict natural
disasters using weather data, food data and agricultural data.
• If ground volunteers or local mining companies confirm charcoal fires
and/or cooking burners on satellite images, then tune model further.
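The k-NN item above could look roughly like this: each label set becomes a binary vector over a shared vocabulary, and a new tile takes the class of its nearest known neighbors. The two training examples reuse the labels from the known test areas; the binary encoding, the Euclidean distance, k=1 and the query labels are assumptions made for illustration.

```python
import math
from collections import Counter

# Label sets from the known test areas, tagged with their known class.
TRAINING = [
    ("urban", ["Outdoors", "Nature", "Landscape", "Scenery", "Urban", "Building",
               "Neighborhood", "Road", "Housing", "City", "Town", "Intersection"]),
    ("rural", ["Landscape", "Outdoors", "Nature", "Scenery", "Aerial View", "Land",
               "Urban", "Road", "Housing", "Building", "Yard", "Neighborhood"]),
]

# Shared vocabulary: every distinct label seen in the training data.
VOCABULARY = sorted({label for _, labels in TRAINING for label in labels})

def to_vector(labels):
    """Binary vector: 1.0 where a vocabulary label is present, else 0.0."""
    present = set(labels)
    return [1.0 if term in present else 0.0 for term in VOCABULARY]

def knn_predict(labels, k=1):
    """Majority class among the k training vectors nearest to the query."""
    query = to_vector(labels)
    ranked = sorted(
        (math.dist(query, to_vector(sample)), cls) for cls, sample in TRAINING
    )
    votes = Counter(cls for _, cls in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical label set for a tile whose class is unknown.
prediction = knn_predict(["Urban", "Building", "City", "Road", "Intersection"])
print(prediction)
```

Two training tiles are far too few for real use; the point is only the shape of the label-to-vector transform and the neighbor vote.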
Regression for website visitor profile using Census data
Automatic clustering of popular searches on medlineplus.gov for May, 2015, using R STAT, PostGIS
11. POTENTIAL AT NASA
• ML and geoanalytics to explore LANDSAT data, and satellite and HELIOS
images –
• Modeling, Analysis and Prediction (MAP) Program – Black Marble maps of
night lights to gain insight on human activity.
• Auto-tagging of media – image, audio and video. E.g. Training videos.
• Log and text analyses.
• Smarter storage. E.g. S3 Intelligent Tiering.
• Solar storms.
12. POTENTIAL AT NASA
Solar Storm
ML assigns labels: Nature, Flare, Light, Outdoors, Sun, Sky, Night,
Astronomy, Universe, Outer Space, Space, Moon, Sunrise, Mountain, Planet
Solar Storm
ML assigns labels: Night, Nature, Space, Outdoors, Universe, Moon,
Astronomy, Outer Space, Sun, Sky, Flare, Light, Mountain, Photo, Photography
13. NEXT STEPS
• Model as a Service – ML Models on AWS Marketplace.
• Frameworks and Tools – Rekognition, Google Vision, Microsoft Computer
Vision, TensorFlow, PyTorch, Jupyter Notebook, AWS SageMaker, R STAT.
• Questions?
This presentation’s word cloud