Undergraduate thesis
Boštjan Ferlič (Author), Milan Ojsteršek (Mentor)

Abstract

Web crawlers are programs that download web pages from the internet, process them, and store them. The theoretical part of the thesis describes the general operation of web crawlers and the problems encountered when implementing them. It also describes the crawlers of the two most widely used web search engines and three open-source web crawlers. In the practical part of the thesis, we implemented a distributed web crawler as two separate applications. One is a graphical interface for editing all settings; the other is a console application that downloads pages and stores them in a database.

Keywords

web crawlers;distributed computer systems;similar content detection;undergraduate theses;

Data

Language: Slovenian
Year of publishing:
Typology: 2.11 - Undergraduate Thesis
Organization: UM FERI - Faculty of Electrical Engineering and Computer Science
Publisher: B. Ferlič
UDC: 004.774.6(043.2)
COBISS: 20193046
Views: 1033
Downloads: 76

Other data

Secondary language: English
Secondary title: WEB SPIDER IMPLEMENTATION FOR PLAGIARISM DETECTION SOFTWARE
Secondary abstract: Web crawlers are applications that download, process, and store web pages from the internet. The theoretical part of the thesis describes the general operation of web crawlers and the problems encountered during their implementation. It also describes the web crawlers of the two most widely used web search engines and three open-source web crawlers. In the practical part of the thesis, we implemented a distributed web crawler as two separate applications. One application is a graphical user interface for configuration, and the other is a console application that downloads web pages and stores them in a database.
Secondary keywords: web crawler;distributed computer system;detection software;
URN: URN:SI:UM:
Type (COBISS): Undergraduate thesis
Thesis comment: Univ. of Maribor, Faculty of Electrical Engineering and Computer Science, Computer Science and Informatics
Pages: VII, 42 f.
ID: 9164814