
Integration of Oracle Intermedia Text With Dspace

This document discusses integrating Oracle Intermedia Text with the digital repository Dspace to enable searching of multimedia contents. InterMedia allows storing rich content like text, images, audio and video in an Oracle database. The authors propose integrating this feature with Dspace to generate themes for abstracts and allow retrieval based on those themes. This would provide an efficient solution for storing and searching large, diverse multimedia collections in a structured digital repository rather than unstructured web pages. However, creating metadata mappings and interfaces for integrating and retrieving visual content would require additional development.


3ICT'18 1570483418

Integration of Oracle Intermedia Text with DSpace

Hafiz Tayyab Rauf
Department of Computer Science
University of Gujrat
Gujrat, Pakistan
[email protected]

Waqas Haider Bangyal
Department of Computer Science
Iqra University
Islamabad, Pakistan
[email protected]

Jamil Ahmad
Department of Computer Science
Kohat University of Science and Technology (KUST)
Kohat, Pakistan
[email protected]

Erum Afzal
Department of Computer Science
Fatima Jinnah Women University
Rawalpindi, Pakistan
[email protected]

Abstract— Multimedia content is widely used on the web. People in remote areas can also benefit from the latest technology through the internet. Visual content can convey a message more efficiently than a written document, especially to people in remote areas where the majority are less educated. In this paper, a new technique is devised to search content in a digital repository on the theme of agriculture, which would contain multimedia content on agricultural reforms and on the use of the latest technologies in the field of agriculture. InterMedia is a collection of features integrated into the Oracle database. It enables us to load rich content into the database, manage the content securely, and deliver it for use in an application. This rich content may consist of text, data, images, graphics, audio, and video files, and is mostly used in web applications. This feature is to be integrated with DSpace (a digital repository), where it would generate themes from abstracts, and retrieval would be made on one of those themes.

Keywords— DSpace, Multimedia, Oracle database, interMedia

I. INTRODUCTION

Visual content has recently become more popular on the web. Visual content can convey a message more efficiently than long descriptive written documents. Many people from different professions benefit by accessing and manipulating this content using different techniques [1]. They are also trying to find new methods to locate the required content in the huge and diverse collection available on the web. Searching for multimedia content is a challenging task. Most multimedia content is not searchable [2]. The problems of accessing visual content are being pointed out, and new dimensions of these problems have also been identified.

Multimedia content can take different forms and formats with diverse dimensions, in contrast with text-based documents and the retrieval methods known as Information Retrieval (IR). It is quite challenging to retrieve the required data using IR. Dynamic multimedia content cannot be extracted with IR alone [3]. It requires efficient techniques to index multimedia content and semantic-based graphical user interfaces for efficient retrieval from huge, remotely stored multimedia databases [4]. Indexing of multimedia content has changed the direction of searching toward automatic detection of features such as shape, color, and sound, known as Content-Based Image Retrieval (CBIR). CBIR has not been able to improve retrieval of dynamic multimedia content. It has not matured in retrieval and is used only for specific purposes. Because implementations of this technology have not been effective, it is not considered useful for real-life queries on dynamic multimedia content. It is doubted how and where this technique is being used efficiently [5]. In this situation it is very difficult for an information manager to deploy a retrieval method that gives efficient results. With the advent of the latest technology, multimedia content has become digitized. This content can be stored on the web and shared with others. People can create their own websites and manage and share the content they possess [6]. But the web has not proved to be the best tool for the preservation and contribution of digital visual content. A query passed through a web search engine for particular content may not return the required content, either because of an inefficient storage/retrieval mechanism or because the object was replaced with another object. Digital repositories are an efficient solution for storing huge and diverse multimedia content. They provide efficient interaction among content producers and researchers [7]. They can make searching more efficient because, unlike web pages, they are structured. The popularity of web portals and domain-specific sites was the first step toward the need for data organization and availability on the web. There are many retrieval functions for visual content in the Oracle database management system. Oracle interMedia, known as Oracle Text, can be described as a complete solution [8]. It provides searching on indexes and themes.

The volume of research on multimedia content retrieval continues to grow, though much of the recent literature appears derivative, presenting applications of existing practice rather than advancing the state of the design. Despite that, significant research problems remain to be addressed in the literature, including better techniques for segmenting images to distinguish objects of interest from their background (or, alternatively, improved methods of feature extraction that do not depend on segmentation), new paradigms for user interaction with multimedia systems, and better ways of representing individual judgements of image similarity.

Overall, there is a need to bridge these semantic gaps, bringing a greater degree of automation to the processes of indexing and retrieving images by the kind of object or scene depicted. Through the extensive use of web technologies, the
demand for flexible, meaning-based multimedia retrieval becomes more significant.

Digital-based research and innovation are no longer limited to the computer science discipline; they have expanded into multiple disciplines and become a collaborative activity [9]. This requires live consultation and the sharing of content and resources, and it has resulted in the development of institutional repositories. To use the full potential of the content stored in institutional repositories, there is a need to develop a knowledge repository that brings together content from different domains and areas. The work done so far has mainly focused on methods for retrieving the information and content in visual material. It is necessary to develop an efficient method that can extract information using context. We are going to design a digital repository on the theme of agriculture. Its storage and retrieval methods will be based on the context of the visual content.

II. LITERATURE REVIEW

In [10], a content-based digital library infrastructure was presented that can be integrated with existing repositories, query engines, and user interface systems that share a content-based retrieval mechanism and a common framework. In that paper, the problems of integrating repositories were tackled using common web standards. But the integration was limited to restricted test operations rather than extended ones. Interaction among different components of the integrated systems was not tackled; only components of a similar type can communicate. Further research on abstraction and navigation based on content relevance is required. CSSEDMS [8] was an extension of the CSSE project. It was developed to manage web content by focusing on the usability and maintainability of information retrieval systems. According to the author, evaluating usability is very important for finding the barriers in its way. A data-centric procedural design was proposed for simple web applications. In the system, data manipulation was done on the existing CSSE database.

The procedure includes fetching data sets from the existing database and storing them back after modification. If reporting functionality were present, the usability of the system would be predictable. The full-text search of SQL Server was used on the title and abstract of the data content. The functionality can be extended to all textual content of a file by applying a filter to binary files.

There are built-in functions that implement the IFilter interface, and such filters need to be developed for unstructured and visual content data. In [11], a storage and retrieval method for recorded lectures was proposed using a Java API and reverse engineering. Using eight classes, the system was able to extract metadata and properties of the recorded files, and indices were built on them. The query response is very quick, and results are displayed during query processing because automatic query completion has been implemented efficiently. Retrieval of recorded content can be made more efficient using optical character recognition [12].

The textual content of a recorded file can then be retrieved, which would make further processing easy. In [13], a system for managing content of different types was presented. It was scalable and very efficient in searching content in storage. The MILOS Content Management System is designed as a general-purpose tool that can be applied to any application requiring the management of records with various kinds of content-based text and information, including efficient storage and retrieval of those records. In that paper, collections with different types of metadata were integrated into a single centralized repository, and the challenge of interoperability was resolved effectively.

But this method required creating a new file for metadata mapping and designing an interface for integrating and retrieving visual content in the repository. There should be an automated mechanism for mapping different metadata information. In [14], searching among heterogeneous repositories was proposed. Information relevance is calculated from semantic features to retrieve the content. An ontology was built from information about the user and the semantic evolution of the participating content. The use of an open-source XML communication protocol made it straightforward for other applications to access. The same technique can be used to link social networks and learning management systems.

III. PROBLEM STATEMENT

Existing methods for retrieving visual content have not proved efficient. There are currently two methods of visual content retrieval: Index-Based Information Retrieval and Content-Based Information Retrieval. The index-based IR methods fall short because manual indexing is a very labor-intensive task. It takes too much time to answer a single query, and the time increases exponentially if multi-indexing methods are used. It is also not a reliable technique for visual content retrieval because it runs into many problems: different keywords may be assigned to the same content; indices developed in a particular language cannot answer queries in other languages; and misspelled keywords and changes in vocabulary are further barriers to retrieval.

One more hurdle is that the same visual content can be perceived and rated differently by different individuals, and older digitized content was stored with minimal description. The content-based method has been described in the literature as having three levels of generalization. At Level 1, retrieval is based on primitive features of the picture or video (color, shape, and texture). Level 2 bases retrieval on logical inferences, such as an image of a flower, and Level 3 bases retrieval on the meaning of the scene, such as a beach scene. The drawback of this method is that only Level 1 is practicable and has been used for retrieval purposes. Semantic parameters, such as the kind of item present in the picture or video, are not tackled. This is a significant problem that needs to be addressed.

DSpace stores information about content items and their metadata in a relational database, providing effective programmatic control and access. In addition, DSpace maintains two types of index to support its services. The first type is an ordered index of items (by date or title) or of authors of items in the archive. This allows tools such as the Web user interface to retrieve and display to the user ordered lists of things to browse through. These indices are maintained in the database.

The second type of index is a fast look-up index for answering the user's keyword search queries. Currently, only some of the Dublin Core fields are indexed. Lucene supports full-text indexing, but DSpace does not use that feature. With the integration of Oracle Text, full-text indexing would be possible.

IV. PROPOSED SOLUTION

The proposed digital multimedia repository combines these characteristics in a single database capable of storing, manipulating, and retrieving visual, audio, and video information, and of delivering multimedia information through Web technologies using theme-based retrieval methods. Retrieving images exclusively by theme is proposed. First, at storage time, an abstract of each piece of visual content must be stored. Then, using the interMedia Text functionality of the Oracle database management system, themes are captured by creating a theme index on a particular column of data. Finally, retrieval is made on the themes of the content. This approach is likely to be more effective than existing methods of traditional information-based retrieval, as well as simple to implement.

The architecture defines the framework for the following modules:

a) Storage layer: Oracle database
b) Middleware: DSpace business logic layer
c) Browsing: application layer

Fig. 1. DSpace Architecture

DSpace, an institutional repository, is an open-source infrastructure. It takes data in any form, saves it in the database, and indexes the stored content to make retrieval efficient. It has three roles: to facilitate the capture and ingest of materials, including metadata about the materials; to provide easy access to the materials through listing and searching; and to support long-term preservation of the materials.

Fig. 2. System Architecture of DSpace

A. Storage layer

The storage layer is used for physical storage of content and metadata. The Oracle relational database management system is used to store content, metadata, information about user groups, people, and authorization, and the status of running workflows.

B. Business layer

The business logic layer deals with the content, users (e-people), authorization, and workflow.

C. Application layer

The application layer comprises components that communicate with the world outside the individual DSpace installation, such as the Web user interface and the Open Archives Initiative protocol for metadata harvesting service. The DSpace Web UI is the largest and most-used component in the application layer. It comes in two versions: JSPUI, built on Java Servlet and JavaServer Pages technology, and XMLUI (Manakin), built on XML and Cocoon technology.

Text indexing is used in information retrieval systems because it is a very powerful tool. It proves very effective when the body of content stored on the web is very large and it becomes difficult for the end user to traverse the huge amount of data. Oracle Text, the integrated feature, turns this huge data into a precise and valuable form. It includes a full knowledge base that can be used effectively while indexing content. It is used not only for exact text matching but also for examining the context of the text. With this feature, our application would be able to examine the context of the content and categorize it by theme instead of by explicit terms or expressions.

V. DISCUSSION

We have implemented the approach and show the filter results of interMedia Text. The storage table is shown in Fig. 3. The experimental results reveal the superiority of the proposed solution.

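As a rough illustration of categorizing content by theme rather than by explicit terms (Section IV), the following Python sketch assigns themes from a tiny hand-made knowledge base and retrieves abstracts by theme. It is a toy stand-in only: the `KNOWLEDGE_BASE` dictionary, the sample abstracts, and the functions `themes_for` and `retrieve_by_theme` are hypothetical and do not reproduce the linguistic analysis that Oracle Text performs.

```python
from collections import Counter

# Hypothetical, hand-made knowledge base mapping surface terms to themes.
# Oracle Text ships its own, far larger knowledge base; this is only a stand-in.
KNOWLEDGE_BASE = {
    "wheat": "agriculture",
    "harvest": "agriculture",
    "irrigation": "agriculture",
    "tractor": "machinery",
    "plough": "machinery",
    "search": "information retrieval",
    "index": "information retrieval",
}


def themes_for(text):
    """Return a Counter of themes, weighted by the number of matching terms."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(KNOWLEDGE_BASE[w] for w in words if w in KNOWLEDGE_BASE)


def retrieve_by_theme(abstracts, theme):
    """Return ids of abstracts carrying the theme, heaviest weight first."""
    scored = [(themes_for(text)[theme], doc_id) for doc_id, text in abstracts.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for weight, doc_id in ranked if weight > 0]


# Made-up sample abstracts for illustration.
abstracts = {
    1: "Drip irrigation raised the wheat harvest in the district.",
    2: "A new tractor and plough were demonstrated to farmers.",
    3: "Building an index speeds up search.",
}

# The word "agriculture" appears in none of the abstracts,
# yet abstract 1 is retrieved through its theme.
print(retrieve_by_theme(abstracts, "agriculture"))
```

The point of the sketch is that the query term never has to occur in the stored text; the mapping from terms to themes does the work that exact matching cannot.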
Fig. 3. Storage Table view

The following example gives a clear view of the implementation of interMedia Text.

A. Creation of tables

    SQL> create table DSpace
         (id       number primary key,
          Abstract varchar2(5000)
         )
         /
    Table created.

    -- Theme result table; ctx_doc.themes expects the columns
    -- query_id, theme and weight.
    SQL> create table mythemes
         (query_id number,
          theme    varchar2(2000),
          weight   number
         )
         /
    Table created.

B. Insertion of data into table

    SQL> insert into DSpace (id, Abstract)
         values (1,
         'Go to your favourite web search engine, type in a constantly transpiring information on the internet such as "database", and wait for a plethora of search result to reflect..'
         )
         /
    1 row created.
    SQL> commit;
    Commit complete.

C. Index creation

    SQL> create index my_idx on DSpace (Abstract)
         indextype is ctxsys.context;
    Index created.

D. PL/SQL procedure

    SQL> begin
           -- Generate themes for the document with key 1 and store them in mythemes.
           ctx_doc.themes(index_name => 'my_idx',
                          textkey    => '1',
                          restab     => 'mythemes');
         end;
         /
    PL/SQL procedure successfully completed.

E. Visualization of themes

    SQL> select theme, weight from mythemes order by weight desc;

TABLE I. THEME VISUALIZATION WITH WEIGHTS

    Theme            Weight
    occurrences      12
    search engine    12
    internet         11
    result           11
    returns          11
    databases        11
    searches         10
    favouritism       6
    type              5
    plethora          4
    frequency         3
    words             3

First, two tables, DSpace and mythemes, are created: one for data storage and the second for storing themes. In the second step, data is entered into the DSpace table. In the third step, a context-type index is created on the Abstract field of the DSpace table. In the fourth step, a PL/SQL procedure is applied to generate the themes from the index. It takes the DSpace table, on which the index "my_idx" has been created, looks up the key value (here 1), retrieves the corresponding indexed data, and passes it to the analyzer. The analyzer creates the themes using its backing knowledge base and stores the result in the theme table. Retrieval can now be made on themes, that is, on keywords that need not exist in the documents.

VI. CONCLUSION

In this paper, the interMedia Text feature has been proposed for text indexing and context generation on DSpace content. Although the Oracle relational database management system is used as backend storage, its text indexing has not been used. The creation of themes is based on linguistic analysis rather than word matching, term counting, weighting, or statistical analysis. InterMedia Text acts as a knowledge base, and contexts/themes are generated from the domain knowledge base. The method proposed in this paper is the integration of text indexing with DSpace content. The themes are generated by relying on the existing domain knowledge base. In future, interMedia functionality can be enhanced further by defining a specific domain or knowledge base, such as a medical dictionary or agricultural domain knowledge.
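The retrieval step described in Section V, where a document is found through a stored theme instead of a literal keyword, can be sketched with Python's built-in sqlite3 module. This is a minimal stand-in under stated assumptions, not the Oracle implementation: SQLite has no theme analyzer, so the theme rows are simply copied from Table I, and the `doc_id` column and the `find_by_theme` helper are hypothetical names introduced for illustration.

```python
import sqlite3

# Abstract text and theme weights are copied from Section V and Table I.
ABSTRACT = (
    'Go to your favourite web search engine, type in a constantly transpiring '
    'information on the internet such as "database", and wait for a plethora '
    'of search result to reflect..'
)
THEMES = [
    ("occurrences", 12), ("search engine", 12), ("internet", 11),
    ("result", 11), ("returns", 11), ("databases", 11), ("searches", 10),
    ("favouritism", 6), ("type", 5), ("plethora", 4), ("frequency", 3),
    ("words", 3),
]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dspace   (id INTEGER PRIMARY KEY, abstract TEXT);
    -- doc_id is a hypothetical column linking each theme to its document.
    CREATE TABLE mythemes (doc_id INTEGER, theme TEXT, weight INTEGER);
    """
)
conn.execute("INSERT INTO dspace (id, abstract) VALUES (1, ?)", (ABSTRACT,))
conn.executemany(
    "INSERT INTO mythemes (doc_id, theme, weight) VALUES (1, ?, ?)", THEMES
)


def find_by_theme(theme):
    """Return (document id, weight) pairs for a theme, heaviest first."""
    return conn.execute(
        "SELECT d.id, t.weight FROM dspace d "
        "JOIN mythemes t ON t.doc_id = d.id "
        "WHERE t.theme = ? ORDER BY t.weight DESC",
        (theme,),
    ).fetchall()


# "occurrences" is a stored theme but never appears in the abstract itself.
print("occurrences" in ABSTRACT)
print(find_by_theme("occurrences"))
```

Joining the theme table back to the storage table is one simple way to expose theme-based lookup to an application layer; in Oracle itself the same intent would normally be expressed against the text index rather than a hand-filled table.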

REFERENCES

[1] V. N. Gudivada and V. V. Raghavan, "Content based image retrieval systems," Computer, vol. 28, no. 9, pp. 18-22, 1995.
[2] T. H. Painter, J. Dozier, D. A. Roberts, R. E. Davis, and R. O. Green, "Retrieval of subpixel snow-covered area and grain size from imaging spectrometer data," Remote Sensing of Environment, vol. 85, no. 1, pp. 64-77, 2003.
[3] Q. Luan et al., "Natural image colorization," in Proceedings of the 18th Eurographics conference on Rendering Techniques, 2007, pp. 309-320.
[4] N. Sebe and Q. Tian, "Personalized multimedia retrieval: the new trend?," in Proceedings of the international workshop on Workshop on multimedia information retrieval, 2007, pp. 299-306.
[5] J. Eakins and M. Graham, "Content-based image retrieval," 1999.
[6] A. Sutcliffe, M. Hare, A. Doubleday, and M. Ryan, "Empirical studies in multimedia information retrieval," in Intelligent multimedia information retrieval, 1997, pp. 449-472.
[7] N. Adam and Y. Yesha, "Strategic directions in electronic commerce and digital libraries: towards a digital agora," ACM Computing Surveys (CSUR), vol. 28, no. 4, pp. 818-835, 1996.
[8] C. Wang, "CSSE Document Management System: Implementation and Usability Evaluation (Doctoral dissertation)".
[9] P. Muthukumar, P. Suresh, S. S. Punithavathani, and J. N. Begum, "A realistic approach for the deployment of national knowledge repositories by leveraging ETL tools," in Recent Trends In Information Technology (ICRTIT), International Conference on, 2012, pp. 542-547.
[10] R. Berndt, H. Krottmaier, S. Havemann, and T. Schreck, "The PROBADO-framework: Content-based queries for non-textual documents," in ELPUB2009 Conference on Electronic Publishing, 2009.
[11] C. Hermann, "Improving document retrieval using special characteristics of lecture recording documents," in Proceedings of the Third Symposium on Information and Communication Technology, 2012, pp. 250-259.
[12] P. Ziewer, "Navigational indices and full text search by automated analyses of screen recorded data," in E-Learn: World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, pp. 3055-3062.
[13] G. Amato, C. Gennaro, F. Rabitti, and P. Savino, "Milos: A multimedia content management system for digital library applications," in International Conference on Theory and Practice of Digital Libraries, pp. 14-25.
[14] R. L. R. Campos, R. L. Comarella, and R. Silveira, "Model of Recommendation System for Indexing and Retrieving the Learning Object based on Multiagent System," Respuestas, vol. 17, no. 2, pp. 21-30, 2012.