Warp 10 - Time Series Analysis on top of Hadoop - HUG France - Paris Spark Meetup 2017-05-16

Tour of the Open Source Warp 10 Time Series Platform including its integration into the Hadoop Ecosystem (Spark, Pig, Flink, Storm).

  1. 1. Spark Meetup - 2017-05-16, Paris. Mathias @Herberts - CTO, Cityzen Data. Warp 10 - Simplifying analysis of time series data on top of Hadoop
  2. 2. `whoami` ■ Former Senior SRE on Big Table at Google ■ Former head of Big Data at Crédit Mutuel Arkéa ■ Pioneer in the use of Hadoop & HBase in production since 2009 ■ Co-Founder and CTO of Cityzen Data, maker of Warp 10 ■ @herberts
  3. 3. Time Series
  4. 4. Time Series are everywhere
  5. 5. IoT & time series data management and analysis
  6. 6. Versatile Data Model
  7. 7. Geo Time Series®
  8. 8. Geo Time Series®
  9. 9. Digital Twin Paradigm
  10. 10. Multiple Versions
  11. 11. Embeddable for Edge Analytics
  12. 12. Standalone Version HA Datalog in-memory
  13. 13. Distributed Version
  14. 14. Secure Solution
  15. 15. Security ■ Encryption and authentication/authorization mechanisms ■ Sandboxed environment for analytics
  16. 16. Rich Analytics
  17. 17. Analytics A stack based language dedicated to time series analytics
  18. 18. Advanced stack based language ■ Result is a JSON array of the various stack levels ■ Support for variables and context saving ■ Code serialization ■ Loops, conditionals, macros - Data Flow model ■ Secure code execution, resource limits
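To make the stack model on this slide concrete, here is a minimal sketch of a postfix evaluator in Python. It is a toy and not WarpScript: the handful of tokens it understands (`+`, `-`, `*`, `DUP`, `SWAP`, `STORE`, `$name`) only mimic the style of the language, and returning the stack top-first as a JSON array is an assumption made for illustration.

```python
import json


def run(program):
    """Evaluate a tiny whitespace-separated postfix program (toy, not WarpScript)."""
    stack = []    # operand stack, the top is the last element
    symbols = {}  # variables filled by STORE
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for token in program.split():
        if token in ops:
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[token](a, b))
        elif token == "DUP":
            stack.append(stack[-1])
        elif token == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif token == "STORE":
            name = stack.pop()           # symbol name is on top
            symbols[name] = stack.pop()  # value is just below it
        elif token.startswith("$"):
            stack.append(symbols[token[1:]])
        elif len(token) >= 2 and token[0] == "'" and token[-1] == "'":
            stack.append(token[1:-1])    # quoted string without spaces
        else:
            stack.append(int(token))
    # Report every stack level as one JSON array, top of the stack first
    # (the ordering is an assumption of this sketch).
    return json.dumps(stack[::-1])


print(run("2 'x' STORE $x 3 * $x DUP +"))
```

The point of the sketch is the execution model: every token either pushes a value or consumes values from the stack, and whatever remains on the stack at the end is the result.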
  19. 19. 5 high level frameworks ■ BUCKETIZE - transform a series so it has regularly spaced ticks ■ MAP - apply a function on a sliding window ■ REDUCE - tick by tick computation on multiple series, producing a single one ■ FILTER - select series based on various criteria ■ APPLY - tick by tick application of an n-ary function
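The three frameworks that reshape data (BUCKETIZE, MAP, REDUCE) can be sketched in plain Python on lists of `(tick, value)` pairs. This is only an illustration of the ideas named on the slide, not the Warp 10 implementation: the function names, the `(start, end]` bucket convention, and the handling of missing ticks are assumptions.

```python
def bucketize_last(series, lastbucket, bucketspan, bucketcount):
    """Regular ticks: keep the last value of each bucket (start, end]."""
    out = []
    for i in range(bucketcount - 1, -1, -1):
        end = lastbucket - i * bucketspan
        start = end - bucketspan
        inside = [v for t, v in series if start < t <= end]
        if inside:
            out.append((end, inside[-1]))
    return out


def map_sliding(series, func, pre, post):
    """Apply func on a window of `pre` ticks before and `post` ticks after each tick."""
    out = []
    for i, (t, _) in enumerate(series):
        window = [v for _, v in series[max(0, i - pre): i + post + 1]]
        out.append((t, func(window)))
    return out


def reduce_tickwise(all_series, func):
    """Tick by tick computation on several series, producing a single one."""
    ticks = sorted({t for s in all_series for t, _ in s})
    lookups = [dict(s) for s in all_series]
    return [(t, func([d[t] for d in lookups if t in d])) for t in ticks]


raw = [(1, 10.0), (4, 11.0), (7, 13.0), (12, 16.0)]
aligned = bucketize_last(raw, lastbucket=15, bucketspan=5, bucketcount=3)
other = [(5, 1.0), (10, 2.0), (15, 3.0)]

print(aligned)
print(map_sliding(aligned, lambda w: sum(w) / len(w), pre=1, post=0))
print(reduce_tickwise([aligned, other], sum))
```

Bucketizing first is what makes the tick by tick reduction meaningful: once both series share the ticks 5, 10 and 15, values can be combined position by position.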
  20. 20. ! != % & && * ** + +! - ->B64 ->B64URL ->BIN ->BYTES ->DOUBLEBITS ->FLOATBITS ->GEOHASH ->HEX ->HHCODE ->HHCODELONG ->JSON ->LIST ->MAP ->MAT ->OPB64 ->PICKLE ->Q ->SET ->TSELEMENTS ->V ->VEC ->Z / < << <= == > >= >> >>> ABS ACOS ADDDAYS ADDMONTHS ADDVALUE ADDYEARS AESUNWRAP AESWRAP AGO AND APPEND APPLY ASIN ASSERT ATAN ATBUCKET ATINDEX ATTICK ATTRIBUTES AUTHENTICATE B64-> B64TOHEX B64URL-> BBOX BIN-> BINTOHEX BITCOUNT BITGET BITSTOBYTES BOOTSTRAP BREAK BUCKETCOUNT BUCKETIZE BUCKETSPAN BYTES-> BYTESTOBITS BYTESTOBITS CALL CBRT CEIL CHUNK CLEAR CLEARDEFS CLEARSYMBOLS CLEARTOMARK CLIP CLONE CLONEEMPTY CLONEREVERSE COMMONTICKS COMPACT CONTAINS CONTAINSKEY CONTAINSVALUE CONTINUE COPYGEO COPYSIGN CORRELATE COS COSH COUNTER COUNTERDELTA COUNTERVALUE COUNTTOMARK CPROB CROP CSTORE CUDF DEBUGOFF DEBUGON DEDUP DEF DEFINED DEFINEDMACRO DELETE DEPTH DET DIFFERENCE DISCORDS DOC DOCMODE DOUBLEBITS-> DOUBLEEXPONENTIALSMOOTHING DROP DROPN DTW DUMP DUP DUPN DURATION DWTSPLIT E ELAPSED ELEVATIONS EMPTY ESDTEST EVAL EVALSECURE EVERY EXP EXPM1 EXPORT FAIL FDWT FETCH FETCHBOOLEAN FETCHDOUBLE FETCHLONG FETCHSTRING FFT FFTAP FILLNEXT FILLPREVIOUS FILLTICKS FILLVALUE FILTER FIND FINDSETS FINDSTATS FIRSTTICK FLATTEN FLOATBITS-> FLOOR FOR FOREACH FORGET FORSTEP FROMBIN FROMBITS FROMHEX FUSE GEO.DIFFERENCE GEO.INTERSECTION GEO.INTERSECTS GEO.REGEXP GEO.UNION GEO.WITHIN GEO.WKT GEOHASH-> GEOPACK GEOUNPACK GET GETHOOK GETSECTION GRUBBSTEST GZIP HASH HAVERSINE HEADER HEX-> HEXTOB64 HEXTOBIN HHCODE-> HUMANDURATION HYBRIDTEST HYBRIDTEST2 HYPOT IDENT IDWT IEEEREMAINDER IFFT IFT IFTE IMMUTABLE INTEGRATE INTERPOLATE INTERSECTION INV ISNULL ISNaN ISO8601 ISODURATION ISONORMALIZE JOIN JSON-> JSONLOOSE JSONSTRICT KEYLIST LABELS LASTBUCKET LASTSORT LASTTICK LBOUNDS LFLATMAP LIMIT LIST-> LMAP LOAD LOCATIONOFFSET LOCATIONS LOCSTRINGS LOG LOG10 LOG1P LORAENC LORAMIC LOWESS LR LSORT LTTB MACROBUCKETIZER MACROFILTER MACROMAPPER MACROREDUCER MAKEGTS MAP MAP-> MAPID MARK MAT-> MATCH MATCHER MAX 
MAXBUCKETS MAXDEPTH MAXGTS MAXLONG MAXLOOP MAXOPS MAXPIXELS MAXSYMBOLS MD5 MERGE META METASET METASORT MIN MINLONG MODE MONOTONIC MSGFAIL MSORT MSTU MUSIGMA NAME NBOUNDS NDEBUGON NEWGTS NEXTAFTER NEXTUP NONEMPTY NOOP NORMALIZE NOT NOTAFTER NOTBEFORE NOTIMINGS NOW NPDF NRETURN NSUMSUMSQ NULL NaN ONLYBUCKETS OPB64-> OPB64TOHEX OPS OPTDTW OR PACK PAPPLY PARSE PARSESELECTOR PARTITION PATTERNDETECTION PATTERNS PFILTER PGraphics PI PICK PICKLE-> PIGSCHEMA PREDUCE PROB PROBABILITY PUT Palpha Parc Pbackground PbeginContour PbeginShape Pbezier PbezierDetail PbezierPoint PbezierTangent PbezierVertex Pblend PblendMode Pblue Pbox Pbrightness Pclear Pclip Pcolor PcolorMode Pconstrain Pcopy PcreateFont Pcurve PcurveDetail PcurvePoint PcurveTangent PcurveTightness PcurveVertex Pdecode Pdist Pellipse PellipseMode Pencode PendContour PendShape Pfill Pget Pgreen Phue Pimage PimageMode Plerp PlerpColor Pline Pmag Pmap PnoClip PnoFill PnoStroke PnoTint Pnorm Ppixels Ppoint PpopMatrix PpopStyle PpushMatrix PpushStyle Pquad PquadraticVertex Prect PrectMode Pred PresetMatrix Protate ProtateX ProtateY ProtateZ Psaturation Pscale Pset PshapeMode PshearX PshearY Psphere PsphereDetail Pstroke PstrokeCap PstrokeJoin PstrokeWeight Ptext PtextAlign PtextAscent PtextDescent PtextFont PtextLeading PtextMode PtextSize PtextWidth Ptint Ptranslate Ptriangle PupdatePixels Pvertex Q-> QCONJUGATE QDIVIDE QMULTIPLY QROTATE QROTATION QUANTIZE RAND RANDPDF RANGE RANGECOMPACT REDEFS REDUCE RELABEL REMOVE RENAME REPLACE REPLACEALL RESET RESETS RESTORE RETURN REV REVBITS REVERSE REXEC REXECZ RINT RLOWESS ROLL ROLLD ROT ROTATIONQ ROUND RSADECRYPT RSAENCRYPT RSAGEN RSAPRIVATE RSAPUBLIC RSASIGN RSAVERIFY RSORT RTFM RUN RUNNERNONCE RVALUESORT SAVE SECTION SECUREKEY SET SET-> SETATTRIBUTES SETVALUE SHA1 SHA1HMAC SHA256 SHA256HMAC SHRINK SIGNUM SIN SINGLEEXPONENTIALSMOOTHING SINH SIZE SNAPSHOT SNAPSHOTALL SNAPSHOTALLTOMARK SNAPSHOTCOPY SNAPSHOTCOPYALL SNAPSHOTCOPYALLTOMARK SNAPSHOTCOPYTOMARK SNAPSHOTTOMARK SORT 
SORTBY SPLIT SQRT STACKATTRIBUTE STACKTOLIST STANDARDIZE STL STLESDTEST STOP STORE STRICTMAPPER STRICTPARTITION STRICTREDUCER STU SUBLIST SUBMAP SUBSTRING SWAP SWITCH TAN TANH TEMPLATE TEMPLATE THRESHOLDTEST TICKINDEX TICKLIST TICKS TIMECLIP TIMEMODULO TIMESCALE TIMESHIFT TIMESPLIT TIMINGS TLTTB TOBIN TOBITS TOBOOLEAN TODEGREES TODOUBLE TOHEX TOKENINFO TOLONG TOLOWER TORADIANS TOSELECTOR TOSTRING TOTIMESTAMP TOTIMESTAMP TOUPPER TR TRANSPOSE TRIM TSELEMENTS TSELEMENTS-> TYPEOF UDF ULP UNBUCKETIZE UNGZIP UNION UNIQUE UNLIST UNMAP UNPACK UNSECURE UNTIL UNWRAP UNWRAPEMPTY UNWRAPSIZE UPDATE URLDECODE URLENCODE UUID V-> VALUEDEDUP VALUEHISTOGRAM VALUELIST VALUES VALUESORT VALUESPLIT VEC-> WEBCALL WHILE WRAP WRAPOPT WRAPRAW WRAPRAWOPT Z-> ZDISCORDS ZIP ZPATTERNDETECTION ZPATTERNS ZSCORE ZSCORETEST [ [] ] ^ bucketizer.and bucketizer.count bucketizer.count.exclude-nulls bucketizer.count.include-nulls bucketizer.count.nonnull bucketizer.first bucketizer.join bucketizer.join.forbid-nulls bucketizer.last bucketizer.mad bucketizer.max bucketizer.max.forbid-nulls bucketizer.mean bucketizer.mean.circular bucketizer.mean.circular.exclude-nulls bucketizer.mean.exclude-nulls bucketizer.median bucketizer.min bucketizer.min.forbid-nulls bucketizer.or bucketizer.percentile bucketizer.sum bucketizer.sum.forbid-nulls d e filter.byattr filter.byclass filter.bylabels filter.bylabelsattr filter.bymetadata filter.last.eq filter.last.ge filter.last.gt filter.last.le filter.last.lt filter.last.ne filter.latencies h m mapper.abs mapper.abscissa mapper.add mapper.and mapper.ceil mapper.count mapper.count.exclude-nulls mapper.count.include-nulls mapper.count.nonnull mapper.day mapper.delta mapper.distinct mapper.dotproduct mapper.dotproduct.positive mapper.dotproduct.sigmoid mapper.dotproduct.tanh mapper.eq mapper.exp mapper.finite mapper.first mapper.floor mapper.ge mapper.geo.approximate mapper.geo.clear mapper.geo.outside mapper.geo.within mapper.gt mapper.hdist mapper.highest mapper.hour 
mapper.hspeed mapper.join mapper.join.forbid-nulls mapper.kernel.cosine mapper.kernel.epanechnikov mapper.kernel.gaussian mapper.kernel.logistic mapper.kernel.quartic mapper.kernel.silverman mapper.kernel.triangular mapper.kernel.tricube mapper.kernel.triweight mapper.kernel.uniform mapper.last mapper.le mapper.log mapper.lowest mapper.lt mapper.mad mapper.max mapper.max.forbid-nulls mapper.max.x mapper.mean mapper.mean.circular mapper.mean.circular.exclude-nulls mapper.mean.exclude-nulls mapper.median mapper.min mapper.min.forbid-nulls mapper.min.x mapper.minute mapper.mod mapper.month mapper.mul mapper.ne mapper.npdf mapper.or mapper.parsedouble mapper.percentile mapper.pow mapper.product mapper.rate mapper.replace mapper.round mapper.sd mapper.sd.forbid-nulls mapper.second mapper.sigmoid mapper.sum mapper.sum.forbid-nulls mapper.tanh mapper.tick mapper.toboolean mapper.todouble mapper.tolong mapper.tostring mapper.truecourse mapper.var mapper.var.forbid-nulls mapper.vdist mapper.vspeed mapper.weekday mapper.year max.tick.sliding.window max.time.sliding.window ms ns op.add op.add.ignore-nulls op.and op.and.ignore-nulls op.div op.eq op.ge op.gt op.le op.lt op.mask op.mul op.mul.ignore-nulls op.ne op.negmask op.or op.or.ignore-nulls op.sub pi ps reducer.and reducer.and.exclude-nulls reducer.argmax reducer.argmin reducer.count reducer.count.exclude-nulls reducer.count.include-nulls reducer.count.nonnull reducer.join reducer.join.forbid-nulls reducer.join.nonnull reducer.join.urlencoded reducer.mad reducer.max reducer.max.forbid-nulls reducer.max.nonnull reducer.mean reducer.mean.circular reducer.mean.circular.exclude-nulls reducer.mean.exclude-nulls reducer.median reducer.min reducer.min.forbid-nulls reducer.min.nonnull reducer.or reducer.or.exclude-nulls reducer.percentile reducer.product reducer.sd reducer.sd.forbid-nulls reducer.shannonentropy.0 reducer.shannonentropy.1 reducer.sum reducer.sum.forbid-nulls reducer.sum.nonnull reducer.var reducer.var.forbid-nulls 
s us w { {} | || } ~ ~= 800 functions
  21. 21. Compact expressiveness
      <%
        'Display write requests count for each region' DOC
        SAVE 'context' STORE
        'cell' STORE
        'PT60m' DURATION 'duration' STORE
        '@TOKEN_READ@' 'TOKEN' STORE
        NOW 'now' STORE
        [ $TOKEN 'writeRequestCount' { 'cell' $cell 'Context' 'regionserver' } $now $duration ] FETCH
        // Remove resets
        false RESETS
        // Align ticks
        [ SWAP bucketizer.last $now 60 STU * 0 ] BUCKETIZE
        // Sum by hname
        [ SWAP [ 'hname' ] reducer.sum ] REDUCE
        FILLNEXT FILLPREVIOUS
        // Compute rates
        [ SWAP mapper.rate 1 0 0 ] MAP
        $context RESTORE
      %>
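Two steps of the macro above, removing counter resets and computing rates, can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the semantics of `RESETS` or `mapper.rate`: a drop in value is treated as a restart from zero, and the rate is the plain difference quotient between consecutive points.

```python
def remove_resets(series):
    """Turn a counter that restarts into a monotonically increasing one."""
    out, offset, prev = [], 0, None
    for tick, value in series:
        if prev is not None and value < prev:
            offset += prev  # the counter restarted: carry over what was counted
        out.append((tick, value + offset))
        prev = value
    return out


def rate(series):
    """Per-time-unit rate between each point and the previous one."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(series, series[1:])
    ]


# A write counter sampled every 60 time units, restarting before tick 120
counter = [(0, 100), (60, 160), (120, 30), (180, 90)]
fixed = remove_resets(counter)
print(fixed)
print(rate(fixed))
```

Without the reset removal, the rate at tick 120 would be a large negative number, which is why the macro cleans the counters before aligning and summing them.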
  22. 22. Extensibility
  23. 23. WarpScript Server Side Macros
      <%
        <'
        This macro does such and such…
        @param xxx
        @param yyy
        '>
        DOC
        // Store the current context so we can create symbols freely
        SAVE '_context' STORE
        // Insert your code here
        // Restore original context
        $_context RESTORE
      %>
      'macro' STORE
      // Unit tests
      // Leave the macro on the stack
      $macro
      // Use via @path/to/macro in your scripts
  24. 24. WarpScript Extensions
      import java.util.HashMap;
      import java.util.Map;
      import io.warp10.script.sdk.WarpScriptExtension;
      import io.warp10.script.NamedWarpScriptFunction;
      import io.warp10.script.WarpScriptException;
      import io.warp10.script.WarpScriptStack;
      import io.warp10.script.WarpScriptStackFunction;
      public class MyExtension extends WarpScriptExtension {
        // Function name -> implementation, exposed through getFunctions()
        private static Map<String,Object> functions = new HashMap<String,Object>();
        private static class MyStackFunction extends NamedWarpScriptFunction implements WarpScriptStackFunction {
          // Pass the function name up to NamedWarpScriptFunction
          public MyStackFunction(String name) {
            super(name);
          }
          @Override
          public Object apply(WarpScriptStack stack) throws WarpScriptException {
            ….
            return stack;
          }
        }
        static {
          functions.put("XXX", new MyStackFunction("XXX"));
        }
        @Override
        public Map<String, Object> getFunctions() {
          return functions;
        }
      }
  25. 25. CALLing external programs
      #!/usr/bin/env python -u
      import cPickle, sys, urllib, base64
      # Output the maximum number of instances of this 'callable' to spawn
      print 10
      # Loop, reading stdin, doing our stuff and outputting to stdout
      while True:
        try:
          line = sys.stdin.readline()
          line = line.strip()
          line = urllib.unquote(line.decode('utf-8'))
          # Remove Base64 encoding
          str = base64.b64decode(line)
          args = cPickle.loads(str)
          # Do our stuff
          output = ….
          # Output result (URL encoded UTF-8).
          print urllib.quote(output.encode('utf-8'))
        except Exception as err:
          print ' ' + urllib.quote(repr(err).encode('utf-8'))
      ...
      ->PICKLE 'UTF-8' BYTES-> ->B64 'path/to/file' CALL B64-> PICKLE->
      ....
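The line protocol used by the callable above (arguments arrive pickled, Base64 encoded and URL encoded; replies are URL encoded UTF-8, with errors prefixed by a space) can be exercised in isolation. The sketch below is a Python 3 rendering of those encoding steps only; the helper names are hypothetical, and `encode_request` merely simulates the caller side for testing purposes.

```python
import base64
import pickle
import urllib.parse


def decode_request(line):
    """Undo the URL encoding, then the Base64 encoding, then unpickle the arguments."""
    text = urllib.parse.unquote(line.strip())
    return pickle.loads(base64.b64decode(text))


def encode_reply(output):
    """Replies are URL encoded UTF-8 text, one per line."""
    return urllib.parse.quote(output)


def encode_error(err):
    """Errors are reported as a leading space followed by the URL encoded message."""
    return " " + urllib.parse.quote(repr(err))


def encode_request(args):
    """Simulate what the caller sends (hypothetical helper, for testing only)."""
    packed = base64.b64encode(pickle.dumps(args, protocol=2)).decode("ascii")
    return urllib.parse.quote(packed)


line = encode_request({"threshold": 3, "values": [1, 2, 5]}) + "\n"
print(decode_request(line))
print(encode_reply("héllo wörld"))
```

Keeping the encoding helpers separate from the read loop makes the callable easy to test without spawning it from a script.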
  26. 26. Visualization
  27. 27. Quantum IDE
  28. 28. Quantum IDE
  29. 29. QuantumViz Web Component
      <!doctype html>
      <html>
      <head>
        <meta name="viewport" content="width=device-width, minimum-scale=1.0, initial-scale=1.0, user-scalable=yes">
        <script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/warp10-quantumviz.html">
      </head>
      <body>
        <warp10-quantumviz width="500" height="400" show-axis="true" tooltip="true" line-width="2" reload="0" host="https://warp.cityzendata.net/">
          NEWGTS
          1 720 <% DUP 'i' STORE 10000000 * NaN NaN NaN $i TORADIANS COS ADDVALUE %> FOR
          [ SWAP ] 'gts' STORE
          [ { 'color' '#00d4ff' 'key' 'Sine' } ] 'params' STORE
          { 'interpolate' 'linear' } 'globalParams' STORE
          { 'gts' $gts 'params' $params 'globalParams' $globalParams }
        </warp10-quantumviz>
      </body>
      </html>
  30. 30. Grafana Integration
  31. 31. Timelion Integration
  32. 32. Processing Integration
      800 'width' STORE
      800 'height' STORE
      400.0 'maxspeed' STORE
      40000.0 'maxalt' STORE
      3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
      @orbit/heatmap/palette/classic 'palette' STORE
      'TOKEN' 'token' STORE
      $width $height '2D' PGraphics
      'MULTIPLY' PblendMode
      'CENTER' PimageMode
      [ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
      DUP 0 GET LASTTICK 'now' STORE
      [ SWAP bucketizer.last $now STU 0 ] BUCKETIZE
      // Create heatmap
      <%
        7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
        <% $CAS ISNULL NOT $ALT ISNULL NOT && %>
        <% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0 SWAP - $height * Pimage %>
        IFT
        0 NaN NaN NaN NULL
      %> MACROREDUCER 'GRAPHER' STORE
      [ SWAP [] $GRAPHER ] REDUCE DROP
      // Colorize
      Ppixels <% DROP Palpha $palette SWAP GET %> LMAP PupdatePixels
      Pencode Pdecode
      $width $height '2D' PGraphics
      // Do the grid
      PnoFill 0 0 $width 1 - $height 1 - Prect
      2.0 PstrokeWeight
      200.0 Pcolor Pstroke
      250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
      0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline
      SWAP 0 0 Pimage
      Pencode
  33. 33. QuantumImg Web Component
      <!doctype html>
      <html>
      <head>
        <link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
        <link rel="stylesheet" href="//maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css">
        <script src="//cdnjs.cloudflare.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
        <script src="//cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js"></script>
        <script src="https://api0.cityzendata.net/widgets/quantumviz/webcomponentsjs/webcomponents.js"></script>
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/polymer/polymer.html">
        <link rel="import" href="https://api0.cityzendata.net/widgets/quantumviz/warp10-quantumviz/quantumviz-warpscript-image.html">
      </head>
      <body>
        <warp10-img width="300" height="300" reload="0" host="https://warp.cityzendata.net/">
          200 'width' CSTORE
          200 'height' CSTORE
          $width $height '2D' PGraphics
          Ppixels <% DROP DROP RAND 0xFFFFFFFF * TOLONG %> LMAP PupdatePixels
          Pencode
        </warp10-img>
      </body>
      </html>
  34. 34. Ok, what about Hadoop?
  35. 35. Dealing with time series data in Hadoop is difficult!
  36. 36. Most, if not all approaches do it wrong!
  37. 37. Either too narrow in focus... think econometric time series
  38. 38. ...or providing too little value... because moving average is simply a beginning
  39. 39. ...or limited to a specific tool think xxxRDD
  40. 40. Warp 10 brings the power of to
  41. 41. Warp10InputFormat ■ Read data stored in Warp 10 at millions of datapoints per second ■ Standard Hadoop InputFormat ■ Compatible with any tool relying on such an InputFormat ■ Compact representation of time series, lower memory footprint
  42. 42. Integration with Spark ■ Enable the use of WarpScript code in the Spark DAG ■ Provide both WarpScriptFunction and WarpScriptFlatMapFunction ■ Manipulate RDD/DataSet/DataFrame elements on the WarpScript stack ■ Extend WarpScript to support custom types if needed ■ Load time series data from any source (Parquet, SQL, …)
  43. 43. Integration with Spark
      DataFrame df = sqlc.read().parquet(...);
      RDD<Row> rdd = df.rdd();
      JavaRDD<Row> jrdd = rdd.toJavaRDD();
      JavaRDD<Row> out = jrdd.mapPartitions(
          new WarpScriptFlatMapFunction<Iterator<Row>,Row>("@ext-macro.mc2"));
      JavaPairRDD<Row, Iterable<Row>> grouped = out.groupBy(
          new WarpScriptFunction<Row, Row>("[ 0 1 ] SUBLIST ->SPARKROW"));
      JavaRDD<Row> merged = grouped.map(
          new WarpScriptFunction<Tuple2<Row,Iterable<Row>>, Row>(
              "LIST-> DROP 0 GET [] SWAP <% SPARK-> 2 GET UNWRAP +! %> FOREACH MERGE WRAPRAW + 2 GET 1 ->LIST ->SPARKROW"));
      List<StructField> fields = new ArrayList<StructField>();
      fields.add(DataTypes.createStructField("wrapper", DataTypes.BinaryType, false));
      StructType st = new StructType(fields.toArray(new StructField[0]));
      DataFrame df2 = sqlc.createDataFrame(merged, st);
      df2.write().parquet("/path/to/output/parquetfile");
  44. 44. Integration with Pig ■ Enable the use of WarpScript code in Pig scripts ■ Provide a WarpScriptRun UDF ■ Manipulate Pig types (tuples, bags, …) on the WarpScript stack ■ Represent time series in a very compact form to speed up processing ■ Load time series data from any source
  45. 45. Integration with Pig
      REGISTER warp10-pig-0.0.10-rc2.jar;
      SET warp.timeunits 'us';
      DEFINE WarpScriptRun io.warp10.pig.WarpScriptRun();
      GTS = LOAD '$input' USING PigStorage() AS (gts: chararray);
      -- Retain only the 'frequency' GTS and chunk them by 5 minutes
      FREQCHUNKS = FOREACH GTS GENERATE FLATTEN(
        WarpScriptRun('DUP UNWRAPEMPTY NAME "frequency" == <% UNWRAP 0 5 m 0 0 "chunkid" false CHUNK WRAP %> <% [] %> IFTE ->V ', gts));
      -- Flatten the bag
      CHUNKS = FOREACH FREQCHUNKS GENERATE FLATTEN($0);
      -- Generate station id, chunk id, gts
      BYSTATIONCHUNK = FOREACH CHUNKS GENERATE FLATTEN(
        WarpScriptRun('DUP UNWRAP LABELS DUP "chunkid" GET SWAP "stationid" GET ', $0))
        AS (stationid: chararray, chunkid: chararray, gts: chararray);
      -- Group by station id, chunk id
      STATIONCHUNKGROUP = GROUP BYSTATIONCHUNK BY (stationid, chunkid) PARALLEL 20;
      -- Merge the GTS to reconstruct the chunk and emit station id, chunk id, gts
      FULLCHUNKS = FOREACH STATIONCHUNKGROUP GENERATE FLATTEN(
        WarpScriptRun('V-> <% DROP 2 GET UNWRAP %> LMAP MERGE DUP LABELS SWAP WRAP SWAP DUP "chunkid" GET SWAP "stationid" GET ', BYSTATIONCHUNK))
        AS (stationid: chararray, chunkid: chararray, gts: chararray);
      STORE FULLCHUNKS INTO '$output' USING PigStorage('\t');
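The data flow of the Pig script above (cut each series fragment into 5 minute chunks, group by station id and chunk id, merge the fragments of each group) can be sketched in plain Python. This only illustrates the grouping logic and is not how `CHUNK` or `MERGE` work internally: numbering chunks by integer division of the tick and the `(station, series)` input shape are assumptions.

```python
from collections import defaultdict

FIVE_MINUTES_US = 5 * 60 * 1_000_000  # the script sets time units to microseconds


def chunk(series, width=FIVE_MINUTES_US):
    """Split a list of (tick, value) pairs into chunks keyed by tick // width."""
    chunks = defaultdict(list)
    for tick, value in series:
        chunks[tick // width].append((tick, value))
    return dict(chunks)


def merge_by_station_chunk(fragments, width=FIVE_MINUTES_US):
    """Group chunked fragments by (station id, chunk id) and merge each group."""
    merged = defaultdict(list)
    for station, series in fragments:
        for chunkid, points in chunk(series, width).items():
            merged[(station, chunkid)].extend(points)
    # Sort each merged chunk by tick so it reads as a single series again
    return {key: sorted(points) for key, points in merged.items()}


# Fragments of the same station may arrive in separate records
fragments = [
    ("s1", [(1, 50.0), (12, 50.1)]),
    ("s2", [(3, 49.9)]),
    ("s1", [(4, 50.2)]),
]
result = merge_by_station_chunk(fragments, width=10)
print(result)
```

A small `width` is used in the example so the chunk boundaries are easy to follow by eye.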
  46. 46. Integration with stream processing engines
      {
        'type' 'spout'
        'id' 'spout-0'
        'output' { 'stream-0' [ 'field-2' 'field-1' ] }
        'parallelism' 1
        'every' 500
        'debug' true
        'macro'
        0 'counter' STORE
        <%
          $counter 1 + 'counter' STORE
          'NOW' 'https://host:port/api/v0/exec' REXEC 'now' STORE
          { 'stream-0' [ [ 'now' $now ] ] }
        %>
      }
      {
        'type' 'bolt'
        'id' 'bolt-0'
        'parallelism' 2
        'debug' true
        'input' { 'spout-0' { 'stream-0' 'shuffle' } }
        'output' { 'stream-1' [ 'outfield' ] }
        'macro'
        <%
          SNAPSHOT [ SWAP ] 'value' STORE
          $value 0 GET _storm.LOG
          { 'stream-1' [ $value ] }
        %>
      }
  47. 47. And also... ■ Integration with Flink ■ Integration with Zeppelin via a WarpScript interpreter ■ Warp 10 sink to push data to Warp 10 once it has been processed ■ Coherent approach in ad-hoc, batch, and streaming modes ■ Reduce amount of code needed to be written, focus on business problems
  48. 48. Open Source Distribution
  49. 49. Thank you!
      3 steps to get you started with Warp 10
      curl -O -L https://dl.bintray.com/cityzendata/generic/io/warp10/warp10/1.2.7/warp10-1.2.7.tar.gz
      tar zxpf warp10-1.2.7.tar.gz
      export JAVA_HOME=/path/to/java/home; cd warp10-1.2.7; ./bin/warp10-standalone.init start
      A set of resources to learn, ask and share
      @warp10io
      http://www.warp10.io/
      http://groups.google.com/forum/#!forum/warp10-users
      https://github.com/cityzendata
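Once a standalone instance is running, WarpScript code can be submitted over HTTP. The sketch below assumes the `/api/v0/exec` path that appears in the stream processing slide, and assumes the instance listens on `http://127.0.0.1:8080`; both the base URL and the helper names are illustrative, so adjust them to your installation.

```python
import json
import urllib.request


def build_exec_request(script, base_url="http://127.0.0.1:8080"):
    """Prepare a POST of WarpScript code to the exec endpoint (URL is an assumption)."""
    url = base_url.rstrip("/") + "/api/v0/exec"
    return urllib.request.Request(url, data=script.encode("utf-8"), method="POST")


def run_warpscript(script, base_url="http://127.0.0.1:8080"):
    """Send the script and parse the reply, expected to be a JSON array of stack levels."""
    request = build_exec_request(script, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage against a running instance (not executed here):
#   print(run_warpscript("1 2 +"))
```

Building the request separately from sending it keeps the URL handling testable without a running server.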
  50. 50. contact @ cityzendata . com
