Extraction and analysis of data on court proceedings in Tomsk using OLAP and Data Mining technologies
| Parent link: | Microsoft Technologies in the Theory and Practice of Programming: proceedings of the XII All-Russian Scientific and Practical Conference of Students, Postgraduates and Young Scientists, Tomsk, March 25-26, 2015 / National Research Tomsk Polytechnic University (TPU), Institute of Cybernetics; editorial board: A. V. Liepinsh [et al.]. [pp. 105-106]. 2015 |
|---|---|
| Main Author: | |
| Corporate Author: | |
| Other Authors: | |
| Summary: | Title from the title page. The article analyzes data obtained from the websites of the regional and district courts of Tomsk using advanced analytic technologies such as OLAP and Data Mining. The process of comparing the structure of web pages and parsing HTML pages with PHP and C# is considered in detail. Near-duplicate detection and shingling, as well as regular expressions and Levenshtein distance, are used to analyze and compare texts, sentences, and words. With these algorithms, the required units can be extracted effectively and fairly accurately. |
| Published: | 2015 |
| Series: | Geoinformation Systems and Technologies |
| Subjects: | |
| Online Access: | http://earchive.tpu.ru/handle/11683/23823 http://www.lib.tpu.ru/fulltext/c/2015/C28/045.pdf |
| Format: | Electronic Book Chapter |
| KOHA link: | https://koha.lib.tpu.ru/cgi-bin/koha/opac-detail.pl?biblionumber=614554 |