Exporting the SSKJ dictionary

Written by Matej Drolc

A few years ago, a friend asked me for advice on how to access her digital copy of the Dictionary of Standard Slovenian Language (SSKJ) from her Android phone. She told me that there was no Android version and that she was using TeamViewer to query the dictionary installed on her desktop machine. Recently, another person brought up the same problem, so what follows is a hypothetical explanation of how I would have approached it.

Exporting the data

At first glance, the task ahead of us looks like reverse-engineering the database structure, or does it? This is a dictionary: the data is not hidden, it is simply served one definition at a time, so why not export it that way?

AutoIt to the rescue

Run("D:\Slovarji\ASP32\ASP32.EXE", "", @SW_MAXIMIZE)
Local $file = FileOpen("SSKJ.txt", 1)    ;1 = append mode
If $file = -1 Then Exit                  ;stop if the output file cannot be opened
Sleep(2000)                              ;give the dictionary time to start
Send("{TAB}")    ;focus record

For $i = 0 To 42090 Step 1
   Send("{TAB}")    ;focus record content
   Send("^a")       ;select all
   Send("^c")       ;copy
   FileWrite($file, ClipGet() & @CRLF)
   Send("{TAB}")    ;focus search box
   Send("{TAB}")    ;focus record
   Send("{DOWN}")   ;move to next record
Next

FileClose($file)

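This should produce an “SSKJ.txt” file in which definitions are delimited by two CRLFs. The script itself appends a single CRLF after each record, so this assumes that the copied text already ends with a line break.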
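Before building anything on top of the export, it may be worth checking that the file contains as many records as the loop copied (0 to 42090 inclusive, so 42091 iterations). The following is only a rough sketch of such a check, assuming the blank-line delimiter described above; the function name countDefinitions and the sample text are made up for illustration.

```javascript
// Count the blank-line-delimited records in exported text.
// Assumes records are separated by two CRLFs, as described above.
function countDefinitions(text) {
    return text
        .split('\r\n\r\n')
        .filter(function (record) { return record.trim().length > 0; })
        .length;
}

// Made-up placeholder records, not real dictionary content.
var sample = 'first placeholder record\r\n\r\nsecond placeholder record\r\n\r\n';
console.log(countDefinitions(sample)); // prints 2
```

Run against the real file (for example on the result of fs.readFileSync('./SSKJ.txt', 'utf8')), a count far from 42091 would suggest that the delimiter assumption does not hold or that some keystrokes were lost during the export.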

Querying the data

A simple way to query the exported data is to read the definitions into memory and serve them on demand. Here is a simple draft implementation in Node.js.

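var fs = require('fs');
var http = require('http');

// Maps each headword to its full definition text.
// Object.create(null) avoids clashes with built-in keys such as "constructor".
var lookup = Object.create(null);

fs.readFile('./SSKJ.txt', 'utf8', function (err, data) {

    if (err) throw err;

    // definitions are separated by an empty line (two CRLFs)
    data.split('\r\n\r\n').forEach(function (element) {

        // the headword sits between the first tab (if any) and the first space
        var key = element.slice(element.indexOf("\t") + 1, element.indexOf(" "));
        lookup[key] = element;
    });
});

http.createServer(function (req, res) {

    //url should be of format /sskj/keyword
    var decoded;
    try {
        decoded = decodeURIComponent(req.url);
    } catch (e) {
        // malformed percent-encoding would otherwise crash the server
        decoded = '';
    }
    console.log(decoded);

    var match = /\/sskj\/(.*)/.exec(decoded);
    var definition = match ? lookup[match[1]] : undefined;

    if (definition !== undefined)
    {
        console.log(definition);
        res.writeHead(200, { 'content-type': 'text/plain; charset=utf-8' });
        res.end(definition);
    }
    else
    {
        res.writeHead(404, { 'content-type': 'text/plain; charset=utf-8' });
        res.end('no match or bad request');
    }

}).listen(1337, "0.0.0.0");
console.log('server running at http://127.0.0.1:1337/ example url http://127.0.0.1:1337/sskj/trubadur');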
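Slovenian headwords often contain characters such as č, š and ž, which have to be percent-encoded before they go into the /sskj/keyword URL; the server decodes them again with decodeURIComponent. As a small sketch of what a client would have to do, assuming a helper named buildLookupUrl (a made-up name, not part of the server above):

```javascript
// Build the lookup URL for a word.
// baseUrl is whatever address the server is reachable at.
function buildLookupUrl(baseUrl, word) {
    // strip trailing slashes so the path is not doubled up
    return baseUrl.replace(/\/+$/, '') + '/sskj/' + encodeURIComponent(word);
}

console.log(buildLookupUrl('http://127.0.0.1:1337', 'trubadur'));
// prints http://127.0.0.1:1337/sskj/trubadur
console.log(buildLookupUrl('http://127.0.0.1:1337/', 'žaba'));
// prints http://127.0.0.1:1337/sskj/%C5%BEaba
```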

Now all it needs is a mobile app with a GUI for the user.

It would also be useful to turn it into an offline mobile app, with the data stored in SQLite and with word autocompletion, but enough hypothetical talk for today.
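To give a rough idea of the autocompletion part, here is one possible sketch: keep the headwords in a sorted array and use a binary search to find the first word that can match the typed prefix. The function name autocomplete and the sample words are chosen purely for illustration.

```javascript
// Return up to `limit` headwords that start with `prefix`.
// `sortedWords` must be sorted with the default Array.prototype.sort() order.
function autocomplete(sortedWords, prefix, limit) {
    // binary search for the first word that is >= prefix
    var lo = 0, hi = sortedWords.length;
    while (lo < hi) {
        var mid = (lo + hi) >>> 1;
        if (sortedWords[mid] < prefix) lo = mid + 1;
        else hi = mid;
    }

    // words sharing the prefix are contiguous from that position on
    var result = [];
    for (var i = lo; i < sortedWords.length && result.length < limit; i++) {
        if (sortedWords[i].indexOf(prefix) !== 0) break;
        result.push(sortedWords[i]);
    }
    return result;
}

// Sample words for illustration only.
var words = ['trubadur', 'abeceda', 'trobenta', 'truma', 'trud'].sort();
console.log(autocomplete(words, 'tru', 10));
// prints [ 'trubadur', 'trud', 'truma' ]
```

With the draft server above, the sorted array could be built once after loading, for example with Object.keys(lookup).sort().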