Documentation Interface
1. Class MyFun
1.1. DocStandardization
Statement
MyFun
Return Value
Parameter
Explanation
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Raw input text. The literal is joined with + because a regular
            // C# string literal cannot span several source lines.
            string TheDoc = "When on board H.M.S. Beagle, as naturalist, \n\nI was " +
                "much struck with certain facts in the distribution of the " +
                "organic beings inhabiting South America, and in the geological relations of the " +
                "present to the past inhabitants of that continent. ";

            // Standardize the document before any further processing.
            string TheDoc_Standardization =
                KeywordExtractionAPI.MyFun.DocStandardization(TheDoc);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nTheDoc_Standardization:\n" +
                TheDoc_Standardization);
            Console.ReadKey();
        }
    }
}
1.2. RemoveStop
Statement
MyFun
Return Value
On success, returns the document with the stop words removed. On failure, returns
a string that begins with "ERROR", followed by the reason for the failure.
Parameter
Explanation
RemoveStop. This function removes stop words from the document. It reads a list of
stop words and deletes every word in the document that appears in that list. This
step is optional: removing stop words may improve the accuracy of keyword
extraction, or it may have no effect.
Cause of error: There are no stop words in the file.
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Already standardized text (lower case, no punctuation), joined
            // with + so that the literal compiles.
            string TheDoc = "origin of species introduction when on board hms beagle " +
                "as naturalist i was much struck with certain facts in the distribution of the organic " +
                "beings inhabiting south america and in the geological relations of the present to " +
                "the past inhabitants of that continent";

            // The result begins with "ERROR" if the call fails.
            string TheDoc_RemoveStop = KeywordExtractionAPI.MyFun.RemoveStop(TheDoc);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nTheDoc_RemoveStop:\n" + TheDoc_RemoveStop);
            Console.ReadKey();
        }
    }
}
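The stop-word filtering described in the Explanation above can be sketched without the DLL. This is only an illustration under stated assumptions, not the KeywordExtractionAPI implementation: the names StopWordSketch and RemoveStopWords and the in-memory stop-word set are hypothetical, and the only behavior taken from this document is that a failed call returns a string beginning with "ERROR".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in; NOT the KeywordExtractionAPI implementation.
public static class StopWordSketch
{
    // Removes every word found in stopWords from a space-separated document.
    // Mirrors the documented convention of an "ERROR"-prefixed string on failure.
    public static string RemoveStopWords(string doc, ISet<string> stopWords)
    {
        if (stopWords == null || stopWords.Count == 0)
        {
            return "ERROR: no stop words in the list";
        }

        IEnumerable<string> kept = doc
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !stopWords.Contains(word));

        return string.Join(" ", kept);
    }
}

public static class StopWordDemo
{
    public static void Main()
    {
        // Assumed stop-word list, for illustration only.
        var stopWords = new HashSet<string> { "of", "the", "on", "as", "i", "was" };

        string result = StopWordSketch.RemoveStopWords(
            "origin of species introduction when on board hms beagle", stopWords);

        // Callers can detect failure by checking the documented "ERROR" prefix.
        if (result.StartsWith("ERROR", StringComparison.Ordinal))
        {
            Console.WriteLine("Stop-word removal failed: " + result);
        }
        else
        {
            Console.WriteLine(result);
        }
    }
}
```

The same prefix check can be applied to the string returned by MyFun.RemoveStop before it is passed on to the later steps.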
1.3. StatisticsWords
Counts the word frequency, word location, and word distance of the input document.
Statement
MyFun
Return Value
Returns an array of WORDSFRE objects. Each element of the array stores the
frequency of one word, its locations, and the distance between two successive
occurrences of the word.
Parameter
TheDoc: The document, on which word segmentation has already been carried out.
Explanation
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Segmented input text, joined with + so that the literal compiles.
            string TheDoc = "origin of species introduction when on board hms beagle " +
                "as naturalist i was much struck with certain facts in the distribution of the organic " +
                "beings inhabiting south america and in the geological relations of the present to " +
                "the past inhabitants of that continent";

            // One WORDSFRE element is returned per distinct word.
            KeywordExtractionAPI.WORDSFRE[] wordsfre =
                KeywordExtractionAPI.MyFun.StatisticsWords(TheDoc);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nwordsfre:");
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                Console.WriteLine(wf.Word + "\t" + wf.Frequency);
            }
            Console.ReadKey();
        }
    }
}
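The three statistics named for StatisticsWords (frequency, location, and distance between successive occurrences) can be illustrated with a small self-contained sketch. It is written under assumptions: the WordStats type, its field names, the 0-based word positions, and the space-separated tokenization are all hypothetical, and the real WORDSFRE class may store its data differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type; the real WORDSFRE class may differ.
public sealed class WordStats
{
    public string Word;
    public int Frequency;
    public List<int> Locations = new List<int>(); // 0-based word positions
    public List<int> Distances = new List<int>(); // gaps between successive occurrences
}

public static class WordStatsSketch
{
    // Returns one WordStats per distinct word, in order of first appearance.
    public static List<WordStats> Count(string doc)
    {
        var byWord = new Dictionary<string, WordStats>();
        var ordered = new List<WordStats>();
        string[] words = doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        for (int position = 0; position < words.Length; position++)
        {
            WordStats stats;
            if (!byWord.TryGetValue(words[position], out stats))
            {
                // First occurrence: create the record.
                stats = new WordStats { Word = words[position] };
                byWord[words[position]] = stats;
                ordered.Add(stats);
            }
            else
            {
                // Later occurrence: record the gap to the previous location.
                int previous = stats.Locations[stats.Locations.Count - 1];
                stats.Distances.Add(position - previous);
            }
            stats.Locations.Add(position);
            stats.Frequency++;
        }
        return ordered;
    }
}

public static class WordStatsDemo
{
    public static void Main()
    {
        foreach (WordStats ws in WordStatsSketch.Count("origin species origin of species"))
        {
            Console.WriteLine(ws.Word + "\t" + ws.Frequency + "\t" +
                string.Join(",", ws.Locations) + "\t" + string.Join(",", ws.Distances));
        }
    }
}
```

A word that occurs only once has an empty distance list in this sketch; how the library treats that case is not stated in this document.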
1.4. QuickSort
Statement
MyFun
Return Value
Parameter
array: An array of type WORDSFRE whose elements store the entropy difference.
Explanation
QuickSort. This function sorts the array in descending order of entropy
difference, so that words with a higher entropy difference come first in the array.
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Input text with stop words already removed, joined with + so that
            // the literal compiles.
            string TheDoc = "origin species introduction board hms beagle " +
                "naturalist struck distribution organic inhabiting south america geological " +
                "relations past inhabitants continent seen latter chapters volume throw light origin " +
                "species mystery mysteries called philosophers return home occurred 1837 question " +
                "patiently accumulating reflecting sorts possibly bearing five allowed speculate " +
                "subject drew short notes enlarged 1844 sketch conclusions probable period day " +
                "steadily pursued object hope excused entering personal details hasty coming decision " +
                "1859 nearly finished complete health strong urged publish abstract especially " +
                "induced wallace studying natural history malay archipelago arrived exactly " +
                "conclusions origin species 1858 sent memoir subject request forward sir charles " +
                "lyell sent linnean society published third volume journal society sir lyell dr hooker " +
                "latter read sketch 1844 honoured thinking advisable publish wallaces excellent " +
                "memoir brief extracts manuscripts abstract publish necessarily imperfect references " +
                "authorities statements trust reader reposing confidence accuracy doubt errors crept " +
                "hope cautious trusting authorities conclusions arrived illustration hope suffice " +
                "feel sensible necessity hereafter publishing detail references conclusions grounded " +
                "hope future am aware scarcely single discussed volume adduced apparently leading " +
                "conclusions directly opposite arrived fair result obtained stating balancing " +
                "arguments question impossible regret space prevents satisfaction acknowledging " +
                "generous assistance received naturalists personally unknown opportunity pass " +
                "expressing deep obligations dr hooker fifteen aided stores knowledge excellent " +
                "judgment considering origin species conceivable naturalist reflecting mutual " +
                "affinities organic embryological relations geographical distribution geological " +
                "succession conclusion species independently created descended varieties species " +
                "nevertheless conclusion founded unsatisfactory shown innumerable species " +
                "inhabiting world modified acquire perfection structure coadaptation justly excites " +
                "admiration naturalists continually refer external conditions ";

            KeywordExtractionAPI.WORDSFRE[] wordsfre =
                KeywordExtractionAPI.MyFun.StatisticsWords(TheDoc);

            // Compute the entropy difference of every word before sorting.
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                wf.EntropyDifference_Max();
            }

            // Sort the whole array, from index 0 to the last index.
            KeywordExtractionAPI.MyFun.QuickSort(wordsfre, 0, wordsfre.Length - 1);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nwordsfre:");
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                Console.WriteLine(wf.Word + "\t" + wf.ED);
            }
            Console.ReadKey();
        }
    }
}
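To show what sorting by entropy difference with the higher values in front means, here is a minimal quicksort sketch. It is not the library's code: the ScoredWord type and the SortDescending name are hypothetical, and only the (array, first index, last index) argument order and the ED field name are borrowed from the example above.

```csharp
using System;

// Hypothetical element type; only the ED field name follows the example above.
public sealed class ScoredWord
{
    public string Word;
    public double ED;
}

public static class QuickSortSketch
{
    // Sorts items[low..high] in place so that larger ED values come first.
    public static void SortDescending(ScoredWord[] items, int low, int high)
    {
        if (low >= high)
        {
            return;
        }

        double pivot = items[low + (high - low) / 2].ED;
        int i = low;
        int j = high;

        while (i <= j)
        {
            // Skip elements that are already on the correct side of the pivot.
            while (items[i].ED > pivot) i++;
            while (items[j].ED < pivot) j--;

            if (i <= j)
            {
                ScoredWord tmp = items[i];
                items[i] = items[j];
                items[j] = tmp;
                i++;
                j--;
            }
        }

        SortDescending(items, low, j);
        SortDescending(items, i, high);
    }
}

public static class QuickSortDemo
{
    public static void Main()
    {
        // Made-up scores, for illustration only.
        var words = new[]
        {
            new ScoredWord { Word = "species", ED = 0.2 },
            new ScoredWord { Word = "origin", ED = 0.9 },
            new ScoredWord { Word = "hope", ED = 0.5 }
        };

        QuickSortSketch.SortDescending(words, 0, words.Length - 1);

        foreach (ScoredWord w in words)
        {
            Console.WriteLine(w.Word + "\t" + w.ED);
        }
    }
}
```

After sorting, the words with the largest entropy difference sit at the front of the array, which matches the ordering described in the Explanation.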
2. Class WORDSFRE
2.2. EntropyDifference_Max
Statement
bool EntropyDifference_Max();
WORDSFRE
Return Value
Parameter
None
Explanation
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Input text with stop words already removed, joined with + so that
            // the literal compiles.
            string TheDoc = "origin species introduction board hms beagle " +
                "naturalist struck distribution organic inhabiting south america geological " +
                "relations past inhabitants continent seen latter chapters volume throw light origin " +
                "species mystery mysteries called philosophers return home occurred 1837 question " +
                "patiently accumulating reflecting sorts possibly bearing five allowed speculate " +
                "subject drew short notes enlarged 1844 sketch conclusions probable period day " +
                "steadily pursued object hope excused entering personal details hasty coming decision " +
                "1859 nearly finished complete health strong urged publish abstract especially " +
                "induced wallace studying natural history malay archipelago arrived exactly " +
                "conclusions origin species 1858 sent memoir subject request forward sir charles " +
                "lyell sent linnean society published third volume journal society sir lyell dr hooker " +
                "latter read sketch 1844 honoured thinking advisable publish wallaces excellent " +
                "memoir brief extracts manuscripts abstract publish necessarily imperfect references " +
                "authorities statements trust reader reposing confidence accuracy doubt errors crept " +
                "hope cautious trusting authorities conclusions arrived illustration hope suffice " +
                "feel sensible necessity hereafter publishing detail references conclusions grounded " +
                "hope future am aware scarcely single discussed volume adduced apparently leading " +
                "conclusions directly opposite arrived fair result obtained stating balancing " +
                "arguments question impossible regret space prevents satisfaction acknowledging " +
                "generous assistance received naturalists personally unknown opportunity pass " +
                "expressing deep obligations dr hooker fifteen aided stores knowledge excellent " +
                "judgment considering origin species conceivable naturalist reflecting mutual " +
                "affinities organic embryological relations geographical distribution geological " +
                "succession conclusion species independently created descended varieties species " +
                "nevertheless conclusion founded unsatisfactory shown innumerable species " +
                "inhabiting world modified acquire perfection structure coadaptation justly excites " +
                "admiration naturalists continually refer external conditions ";

            KeywordExtractionAPI.WORDSFRE[] wordsfre =
                KeywordExtractionAPI.MyFun.StatisticsWords(TheDoc);

            // Compute the entropy difference of every word.
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                wf.EntropyDifference_Max();
            }

            // Sort so that the highest entropy differences come first.
            KeywordExtractionAPI.MyFun.QuickSort(wordsfre, 0, wordsfre.Length - 1);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nwordsfre:");
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                Console.WriteLine(wf.Word + "\t" + wf.ED);
            }
            Console.ReadKey();
        }
    }
}
2.3. EntropyDifference_Normal
Statement
bool EntropyDifference_Normal();
WORDSFRE
Return Value
Parameter
None
Explanation
EntropyDifference_Normal. This function calculates the entropy difference of the
word based on the general entropy.
Example
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using KeywordExtractionAPI;

namespace KEDLL
{
    class Program
    {
        static void Main(string[] args)
        {
            // Input text with stop words already removed, joined with + so that
            // the literal compiles.
            string TheDoc = "origin species introduction board hms beagle " +
                "naturalist struck distribution organic inhabiting south america geological " +
                "relations past inhabitants continent seen latter chapters volume throw light origin " +
                "species mystery mysteries called philosophers return home occurred 1837 question " +
                "patiently accumulating reflecting sorts possibly bearing five allowed speculate " +
                "subject drew short notes enlarged 1844 sketch conclusions probable period day " +
                "steadily pursued object hope excused entering personal details hasty coming decision " +
                "1859 nearly finished complete health strong urged publish abstract especially " +
                "induced wallace studying natural history malay archipelago arrived exactly " +
                "conclusions origin species 1858 sent memoir subject request forward sir charles " +
                "lyell sent linnean society published third volume journal society sir lyell dr hooker " +
                "latter read sketch 1844 honoured thinking advisable publish wallaces excellent " +
                "memoir brief extracts manuscripts abstract publish necessarily imperfect references " +
                "authorities statements trust reader reposing confidence accuracy doubt errors crept " +
                "hope cautious trusting authorities conclusions arrived illustration hope suffice " +
                "feel sensible necessity hereafter publishing detail references conclusions grounded " +
                "hope future am aware scarcely single discussed volume adduced apparently leading " +
                "conclusions directly opposite arrived fair result obtained stating balancing " +
                "arguments question impossible regret space prevents satisfaction acknowledging " +
                "generous assistance received naturalists personally unknown opportunity pass " +
                "expressing deep obligations dr hooker fifteen aided stores knowledge excellent " +
                "judgment considering origin species conceivable naturalist reflecting mutual " +
                "affinities organic embryological relations geographical distribution geological " +
                "succession conclusion species independently created descended varieties species " +
                "nevertheless conclusion founded unsatisfactory shown innumerable species " +
                "inhabiting world modified acquire perfection structure coadaptation justly excites " +
                "admiration naturalists continually refer external conditions ";

            KeywordExtractionAPI.WORDSFRE[] wordsfre =
                KeywordExtractionAPI.MyFun.StatisticsWords(TheDoc);

            // Compute the entropy difference of every word from the general entropy.
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                wf.EntropyDifference_Normal();
            }

            // Sort so that the highest entropy differences come first.
            KeywordExtractionAPI.MyFun.QuickSort(wordsfre, 0, wordsfre.Length - 1);

            Console.WriteLine("TheDoc:\n" + TheDoc);
            Console.WriteLine("\nwordsfre:");
            foreach (KeywordExtractionAPI.WORDSFRE wf in wordsfre)
            {
                Console.WriteLine(wf.Word + "\t" + wf.ED);
            }
            Console.ReadKey();
        }
    }
}