2. Solutions
• Regular expression (can be slow and memory hungry)
• Lucene (full-text search engine library)
• Solr (standalone full-text search server)
• SolrJ (Java client for Solr)
3. Regular expression
• (what it is) a sequence of symbols (that is, a
string) that identifies a set of strings
• (what it does) defines a function that takes a
string as input and returns a yes/no value,
depending on whether or not the string
follows a given pattern.
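The "string in, yes/no out" view can be sketched in Java as a predicate built from a pattern. This is only an illustrative sketch: the class name, the helper `containsPattern`, and the sample pattern are assumptions, not part of the original slides.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class RegexPredicateSketch {
    // Hypothetical helper: turns a regular expression into a yes/no function.
    // The function answers true when the pattern occurs anywhere in the input.
    static Predicate<String> containsPattern(String regex) {
        Pattern p = Pattern.compile(regex); // compile once, reuse for every input
        return s -> p.matcher(s).find();
    }

    public static void main(String[] args) {
        // Sample pattern (an assumption): three lowercase letters, a slash, three lowercase letters
        Predicate<String> hasCurrencyPair = containsPattern("[a-z]{3}/[a-z]{3}");
        System.out.println(hasCurrencyPair.test("rate for eur/usd today")); // true
        System.out.println(hasCurrencyPair.test("no pair here"));           // false
    }
}
```

The pattern is compiled once and the resulting predicate is reused, which avoids recompiling the expression for every input string.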
4. Regular expression (example)
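import java.util.regex.Matcher;
import java.util.regex.Pattern;

// '.' matches any single character, so the pattern also matches "eur&usd"
// (the original "eur*usd" means "eu", zero or more "r", then "usd", and would not match)
Pattern p = Pattern.compile("eur.usd");
// Lowercase the input so that the match ignores case
Matcher m = p.matcher(
    "In quel ramo del lago di eUr&uSd".toLowerCase()
);
if (m.find()) {
    // found! But where in the string?
}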
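The question "where in the string?" can be answered with `Matcher.start()` and `Matcher.end()`. The sketch below assumes the intended pattern is `eur.usd` (with `.` standing for any single character); the class and the helper `firstMatch` are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositionSketch {
    // Hypothetical helper: returns {start, end} of the first match,
    // or null when the pattern does not occur in the text.
    static int[] firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        // start() is inclusive, end() is exclusive
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        String text = "In quel ramo del lago di eUr&uSd".toLowerCase();
        int[] span = firstMatch("eur.usd", text);
        if (span != null) {
            // prints: match at [25, 32): eur&usd
            System.out.println("match at [" + span[0] + ", " + span[1] + "): "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Calling `find()` again on the same `Matcher` would move on to the next occurrence, so a loop over `find()` lists every match position.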
5. Lucene
• Lucene is a high-performance, full-featured text
search engine library written entirely in Java. It
is a technology suitable for nearly any
application that requires full-text search,
especially cross-platform.
• Apache Software Foundation
• Stable release 4.3.0 / May 6, 2013
• Development status Active
6. Lucene (example)
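import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Alternative: new StandardAnalyzer(Version.LUCENE_43), which splits text into tokens
// KeywordAnalyzer treats each field value as a single token
Analyzer analyzer = new KeywordAnalyzer();
// In-memory index
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, analyzer);
IndexWriter w = new IndexWriter(index, config);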
7. Lucene (example 2)
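import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;

private void addDoc(long time, String value, String flag) throws Exception {
    Document doc = new Document();
    // StringField indexes the value verbatim as a single token; Store.YES keeps it retrievable
    doc.add(new StringField("time", String.valueOf(time), Field.Store.YES));
    doc.add(new StringField("value", value, Field.Store.YES));
    doc.add(new StringField("flag", flag, Field.Store.YES));
    w.addDocument(doc);
}
→ w.commit(); // run at the end of the batch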
9. Solr
• Solr is written in Java and runs as a standalone full-text search
server within a servlet container such as Jetty.
• Solr uses the Lucene Java search library at its core for full-text
indexing and search, and has REST-like HTTP/XML and JSON APIs
that make it easy to use from virtually any programming language.
Solr's powerful external configuration allows it to be tailored to
almost any type of application without Java coding, and it has an
extensive plugin architecture when more advanced customization is
required.
• Apache Software Foundation
• Stable release 4.3.0 / May 6, 2013
• Development status Active
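Because Solr is reached over HTTP, a query is essentially a URL. The sketch below only builds such a URL with the JDK and sends nothing over the network. The host and port, the core name `collection1`, and the field name `flag` are assumptions for illustration; `select`, `q` and `wt=json` are the standard Solr query handler and parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SolrQueryUrlSketch {
    // Hypothetical helper: builds a Solr select URL that asks for a JSON response.
    // baseUrl and core depend on the actual installation.
    static String selectUrl(String baseUrl, String core, String query) {
        try {
            // The query string must be URL-encoded before it is placed in the "q" parameter
            String q = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "/" + core + "/select?q=" + q + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed local installation and field name
        System.out.println(selectUrl("http://localhost:8983/solr", "collection1", "flag:eur usd"));
    }
}
```

Any HTTP client in any language can then issue a GET on the resulting URL, which is the point of the REST-like API; SolrJ wraps the same calls behind a Java interface.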
10. SolrJ
• SolrJ is a Java client for accessing Solr.
• It offers a Java interface to add, update,
and query the Solr index.
• Last version: 1.4.X