Parsing Chrome Artifacts with Python! Part 2
Continuing on with my riveting series on how to parse the Chrome SQLite files with Python, let's dig a little deeper into SQL portion by taking a look the History database.
History Database
Here are the tables and associated fields:
- downloads: id,current_path,target_path, start_time, received_bytes, total_bytes, state, danger_type, interrupt_reason, end_time, opened, referrer, by_ext_id, by_ext_name, etag, last_modified, mime_type, original_mime_type
- downloads_url_chains: id, chain_index, url
- keyword_search_terms: keyword_id, url_id, lower_term, term
- meta: key,value
- segment_usage: id, segment_id, time_slot, visit_count
- segments: id, name, url_id
- urls: id, url, title, visit_count, typed_count, last_visit_time, hidden, favicon_id
- visits: id, url, visit_time, from_visit, transition, segment_id, visit_duration
- visit_source: id, source
ses
Some of these tables are useful on there own but much more information can be derived from querying more than one at a time. For example, the visits table stores information about the when and how often you visit a URL. To reduce redundant information in the database, instead of storing the URL again in the visits table, the visits table contains a pointer to the index of the URL in the url table.
import sqlite3, datetime
def fixDate(timestamp):
#Chrome stores timestamps in the number of microseconds since Jan 1 1601.
#To convert, we create a datetime object for Jan 1 1601...
epoch_start = datetime.datetime(1601,1,1)
#create an object for the number of microseconds in the timestamp
delta = datetime.timedelta(microseconds=int(timestamp))
#and return the sum of the two.
return epoch_start + delta
#Here is a simple example of how to query two tables and produce one meaningful output
selectStatement = 'SELECT visits.visit_time, urls.url, urls.title FROM visits, urls WHERE visits.url=urls.id;'
#These can be a bit tricky so lets break it down...
#FROM visits, urls : From the "visits" and "urls" tables
#WHERE visits.url=urls.id : If the value in the visits.url field matches an id value in the url table...
#SELECT visits.visit_time, urls.url, urls.title : Return the visit_time from the visits table, the URL and Title from the url table
historyFile = 'C:\\Users\\User\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\History'
c = sqlite3.connect(historyFile)
for row in c.execute(selectStatement):
print str(fixDate(row[0])),"\n\tURL: ", str(row[1]), "\n\tTITLE: ", row[2].encode("utf-8")
'''
#########Sample Output#########
2015-07-22 15:46:47.718275
URL: http://www.decalage.info/python/olefileio
TITLE: olefile - a Python module to read/write MS OLE2 files | Decalage
2015-07-22 15:46:53.788847
URL: https://bitbucket.org/decalage/olefileio_pl/wiki/Install
TITLE: decalage / OleFileIO_PL / wiki / Install ΓÇö Bitbucket
2015-07-22 15:49:52.388096
URL: http://www.decalage.info/python/olefileio
TITLE: olefile - a Python module to read/write MS OLE2 files | Decalage
2015-07-22 15:50:15.158813
URL: http://blog.didierstevens.com/programs/oledump-py/
TITLE: oledump.py | Didier Stevens
2015-07-22 15:50:42.258023
URL: http://www.kahusecurity.com/2013/dissecting-a-malicious-word-document/
TITLE: Dissecting a Malicious Word Document | Kahu Security
'''
/* Look for all search terms and the site they went searched from */
SELECT keyword_search_terms.term, urls.url
FROM urls, keyword_search_terms
WHERE keyword_search_terms.url_id=urls.id;
/* Look for search term matching a string. Here we are looking for "install" */
SELECT keyword_search_terms.term, urls.url
FROM urls, keyword_search_terms
WHERE keyword_search_terms.url_id=urls.id AND keyword_search_terms.term LIKE "%install%";
"Jon, I am a seasoned Information Security professional and I am taking the time to read your silly blog. Don't bore me."Imaginary Reader, I respect your candor and thirst for knowledge. Fine! Challenge accepted.
Here is a query that will query the visits table and the urls table for the URLS for the visits and their referring visits URL, assuming both are available.
SELECT urls.url AS "NAVIGATED_FROM", VISITSESSIONS.URL AS "NAVIGATED_TO"
FROM (
SELECT visits.id AS ID,visits.from_visit AS CHILDSESSION, visits.url AS URLID, urls.url AS URL
FROM visits, urls
WHERE visits.url = urls.id) AS VISITSESSIONS, visits, urls
WHERE visits.id = VISITSESSIONS.Parent AND visits.url = urls.id
/*
####SAMPLE OUTPUT#####
http://www.mozilla.org/ https://www.mozilla.org/
https://www.mozilla.org/ https://www.mozilla.org/en-US/
https://www.mozilla.org/en-US/ https://www.mozilla.org/firefox/new/?scene=2#download-fx
https://www.mozilla.org/firefox/new/?scene=2#download-fx https://www.mozilla.org/en-US/firefox/new/?scene=2#download-fx
https://github.com/williballenthin https://github.com/williballenthin/python-registry
*/
SELECT visits.visit_time, urls.url
FROM visits, urls
WHERE visits.url = urls.id
ORDER BY visits.visit_time
/*
########SAMPLE########
13082502632412243,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8
13082502636950357,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=install+firefox
13082502642336239,http://www.mozilla.org/
13082502642336239,https://www.mozilla.org/
13082502642336239,https://www.mozilla.org/en-US/
13082502855571719,https://www.mozilla.org/en-US/
13082502869473924,https://www.mozilla.org/firefox/new/?scene=2#download-fx
13082502869473924,https://www.mozilla.org/en-US/firefox/new/?scene=2#download-fx
13082503576546744,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8
13082503584223125,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=wireshark%20filter%20flash
13082503593833681,https://ask.wireshark.org/questions/8650/capturing-http-requests-of-a-flash-application
13082567906034617,https://ask.wireshark.org/questions/8650/capturing-http-requests-of-a-flash-application
13084943745780659,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8
13084943750422298,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=analyzeMFT
13084943753991890,https://github.com/dkovar/analyzeMFT
13084944413570153,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8
13084944422840343,https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=parse%20windows%20registry%20python
*/