No me indexes que me cacheo

[Diagram labels: Pub doc; Index; Cache; Index & Cache; Other Search Engines]

Cache
http://www.elladodelmal.com/2013/08/una-anecdota-con-google-archive-y-la.html

Primary Index & Secondary Index
http://www.elladodelmal.com/2014/07/cuantas-urls-se-pueden-extraer-con.html

What is Robots.txt for?
• Prevents the content of the protected
URLs from being indexed
• Consequently, those URLs are not spidered
• Does not prevent the URLs themselves
from being indexed
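
To make the bullets above concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are hypothetical. It only shows what a compliant spider decides to fetch; it says nothing about whether a search engine lists the URL itself, which robots.txt does not control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant spider will not fetch anything under /private/ ...
can_crawl_private = parser.can_fetch("*", "https://example.com/private/report.pdf")
# ... while the rest of the site remains crawlable.
can_crawl_public = parser.can_fetch("*", "https://example.com/index.html")

print(can_crawl_private, can_crawl_public)
```

If another site links to the blocked URL, the URL can still end up in the index even though its content was never fetched.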

Robots.txt Security Issues
B14Ck S30 Pwn1nage!

Robots.txt “leakage” + Archive.org

Robots.txt “leakage” + Google
http://www.elladodelmal.com/2013/09/buscando-en-robotstxt-lo-que-esta.html

Robots.txt “leakage” + Google +
Directory Listing
http://www.elladodelmal.com/2013/09/buscando-en-robotstxt-lo-que-esta.html

Robots.txt “leakage” + Google + IIS
ShortName bug
http://www.elladodelmal.com/2013/10/un-bug-de-iis-short-name-en-el-windows.html

Robots.txt “leakage” + Google + IIS
ShortName + FOCA
http://www.elladodelmal.com/2012/07/listado-de-ficheros-en-iis-7-utilizando.html

Indexing the Robots.txt:
Blogger preview + Cache
http://www.elladodelmal.com/2012/08/minority-report-pre-visualizando-el.html

Indexing the Robots.txt:
WordPress preview + Cache
http://www.elladodelmal.com/2014/07/wordpress-ten-cuidado-con-el-cacheo-de.html

“GmailGate” with Robots.txt:
October 2013
http://www.elladodelmal.com/2013/10/79400-urls-de-gmail-indexadas-en-google.html

April 2014
http://www.elladodelmal.com/2014/05/gmail-borraste-en-google-pero-te-quedan.html

July 2014
http://www.elladodelmal.com/2014/07/googe-si-usa-bing-para-borrar-las-urls.html

“Facebookgate” with Robots.txt
http://www.elladodelmal.com/2013/09/facebook-tiene-problemas-con-la.html

“WhatsAppGate” with Robots.txt
http://www.elladodelmal.com/2013/09/problemas-de-privacidad-de-whatsapp-con.html

Indexing the robots.txt + XSS =
Google-Persistent XSS
http://es.slideshare.net/chemai64/xss-google-persistentes

Robots.txt
• Prevents indexing from the listed
paths downward.
• Keeps content from being stored in the
index of Google/Bing/others.
• Does not prevent the URL, the title and the
keywords of the link from being indexed.
• Can be an information leak in
targeted attacks and in dorking
attacks.
• Does not undo indexing that happened in the past.
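
The information-leak point can be illustrated with a short sketch: anyone can download a robots.txt file and read the Disallow lines as a list of interesting paths. The function name `disallowed_paths` and the sample file are hypothetical, and the parsing is deliberately simplified (it ignores User-agent grouping).

```python
def disallowed_paths(robots_txt):
    """Return the paths named in Disallow lines, in file order."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow value blocks nothing, so it is skipped.
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


# Hypothetical robots.txt, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/db.sql  # a file named explicitly
Disallow:
"""

leaked = disallowed_paths(SAMPLE)
print(leaked)
```

Every entry returned is a path the site owner considered sensitive enough to hide from spiders, which is exactly what makes it useful in a targeted attack.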

“Evernotegate” with Robots.txt
http://www.elladodelmal.com/2014/08/evernote-no-quiere-hacer-nada-cuida-tus.html

Not in the cache, but it is in the index

Evernote, Index & Cache

The Albacete Balompié “viagrate”

How to manage the relationship?
http://www.slideshare.net/chemai64/black-seov3

• Avoid paths with mixed content
(public/private)
• Avoid unlinked content in public
paths
• Avoid well-known private paths (/etc/ /users/)
• Avoid explicit private paths
• Avoid automatic private configurations
• Avoid private paths that point to a file
• Apply the same configuration to all the
spiders of all Internet search engines
• Protect private paths with access
control lists where possible
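
Two of the recommendations above lend themselves to an automated check. The sketch below is a rough heuristic, not a complete audit: the function name `review_disallow` is hypothetical, the well-known paths are just the two examples from the list, and "has a dot in the last segment" is an assumed approximation of "points to a file".

```python
import posixpath

# The two example paths given in the list above; a real check would use a longer list.
WELL_KNOWN_PRIVATE = ("/etc/", "/users/")


def review_disallow(path):
    """Return a list of warnings for a single Disallow path."""
    warnings = []
    if any(path.lower().startswith(prefix) for prefix in WELL_KNOWN_PRIVATE):
        warnings.append("well-known private path")
    # Assumption: a dot in the last path segment means a specific file is named.
    last_segment = posixpath.basename(path)
    if "." in last_segment:
        warnings.append("points at a specific file")
    return warnings


for candidate in ("/users/", "/backup/db.sql", "/x7f3/"):
    print(candidate, review_disallow(candidate))
```

Paths that pass these checks can still be sensitive, so the access control list recommendation remains the stronger protection.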
http://www.slideshare.net/chemai64/black-seov3

(Google)
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag

HTML Meta Tags
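
As an illustration, a page can carry a tag such as `<meta name="robots" content="noindex, noarchive">`. The sketch below uses Python's standard `html.parser` to check which robots directives a page declares. The class name `RobotsMetaParser` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.extend(
                item.strip().lower() for item in content.split(",") if item.strip()
            )


# Hypothetical page, for illustration only.
PAGE = (
    '<html><head><meta name="robots" content="noindex, noarchive">'
    "</head><body>private</body></html>"
)

meta_parser = RobotsMetaParser()
meta_parser.feed(PAGE)
print(meta_parser.directives)
```

A spider only sees this tag if it is allowed to fetch the page, so a URL blocked in robots.txt never gets its meta tag read.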

X-Robots-Tag HTTP header
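
The same directives can be sent as a response header, which also works for non-HTML files such as PDFs. Below is a minimal sketch of a WSGI application that adds the header to every response. The `app` function, the response body, and the choice of `noindex, noarchive` are illustrative assumptions, not a recommended production setup.

```python
def app(environ, start_response):
    """Minimal WSGI app that marks every response as noindex, noarchive."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("X-Robots-Tag", "noindex, noarchive"),
    ]
    start_response("200 OK", headers)
    return [b"private report\n"]


# Call the app directly (no server needed) to inspect the headers it sends.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = b"".join(app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, fake_start_response))
print(captured["status"], captured["headers"]["X-Robots-Tag"])
```

In practice the header is usually set in the web server configuration rather than in application code; the WSGI form is used here only because it can be run and inspected without a server.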

Google WebMaster Tools
https://www.google.com/webmasters/tools/home?hl=es

Faast: Persistent Pentesting
• BlackSEO
– Cheap Viagra
– Cheap Software
– Etc.
• Robots.txt fingerprinting
– SW versions
• Robots.txt leakage
– Testing all forbidden
URLs
• Robots.txt indexation
– Searching forbidden
URLs in Google/Bing
https://www.elevenpaths.com/technology/faast/index.html
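
The "Robots.txt indexation" check can be sketched as turning each forbidden path into a search query. This is not how Faast is implemented (the slides do not say); it is a hypothetical illustration that assumes the `site:` and `inurl:` operators of Google, with an invented helper name `dork_queries` and example.com as a placeholder domain.

```python
def dork_queries(domain, disallowed):
    """Build one search query per non-trivial Disallow path."""
    queries = []
    for path in disallowed:
        # Remove surrounding slashes and wildcards so the fragment is searchable.
        fragment = path.strip("/*")
        if fragment:
            queries.append(f"site:{domain} inurl:{fragment}")
    return queries


# Hypothetical input: paths taken from a robots.txt file.
queries = dork_queries("example.com", ["/admin/", "/private/docs/", "/"])
for query in queries:
    print(query)
```

Any result returned for such a query is a forbidden URL that reached the index anyway.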

Chema Alonso
chema@11Paths.com
http://www.elladodelmal.com
@chemaalonso
Questions?
