Monday, November 22, 2004

SQL Server Integration Services: text mining

First, what is SQL Server Integration Services? It's just the new name in SQL Server 2005 for Data Transformation Services (DTS).

Donald Farmer has an article, Things to try with SQL Server Integration Services: text mining, about two new objects in this new version, Term Lookup and Term Extraction:

Term Extraction enables you to retrieve the key terms from a Unicode string or text column. It scores terms based on a number of factors, including its English grammar and syntax (only English at the moment), but also parameters you can tune yourself such as the length of phrases and their occurrence. The output is a single column containing all the key terms found - and this output can be sorted, aggregated, routed and saved like any other data.
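
To get a feel for the idea, here is a rough Python sketch of frequency-based term extraction. It is not how the Integration Services component works internally: the real thing analyzes English grammar, while this toy version only counts word n-grams and skips a small stop-word list. The names extract_terms, max_phrase_length, min_frequency and STOP_WORDS are made up for illustration; they only loosely mirror the tunable phrase length and occurrence settings mentioned above.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the real component uses grammar analysis instead.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def extract_terms(rows, max_phrase_length=2, min_frequency=2):
    """Return (term, count) pairs for word n-grams seen at least min_frequency times."""
    counts = Counter()
    for text in rows:
        words = re.findall(r"[a-z0-9]+", text.lower())
        for n in range(1, max_phrase_length + 1):
            for i in range(len(words) - n + 1):
                phrase = words[i:i + n]
                # Skip phrases that start or end with a stop word.
                if phrase[0] in STOP_WORDS or phrase[-1] in STOP_WORDS:
                    continue
                counts[" ".join(phrase)] += 1
    # Most frequent first, ties broken alphabetically.
    return sorted(
        ((term, count) for term, count in counts.items() if count >= min_frequency),
        key=lambda pair: (-pair[1], pair[0]),
    )


rows = [
    "The data flow task failed",
    "Restart the data flow after the task failed",
]
for term, count in extract_terms(rows):
    print(term, count)
```

The result is a single list of terms with counts, which could then be sorted, aggregated or saved like any other column of data.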



Term Lookup is, in a sense, the opposite functionality. If you have a reference table with a column containing key terms, you can lookup from an incoming text field to a reference column in that table to see if it contains key terms found in the reference table. You should also identify a column to pass-through from the original source, which enables you to refer the results back to the incoming records.
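
The lookup direction can be sketched the same way. The Python below is only an illustration of the concept, not the component's actual behavior: term_lookup, the row ids and the sample data are all invented, and reporting a per-row count for each matched term is an assumption on my part. The row id plays the role of the pass-through column that ties each match back to its incoming record.

```python
import re


def term_lookup(rows, reference_terms):
    """Return (passthrough_id, term, frequency) for each reference term found in a row."""
    results = []
    for row_id, text in rows:
        lowered = text.lower()
        for term in reference_terms:
            # Match the term as whole words only, ignoring case.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            frequency = len(re.findall(pattern, lowered))
            if frequency:
                results.append((row_id, term, frequency))
    return results


# Incoming records: (pass-through id, text column).
incoming = [
    (101, "Data flow failed; restarting the data flow."),
    (102, "Control flow completed."),
]
# Stand-in for the key-term column of the reference table.
reference = ["data flow", "control flow", "package"]

for match in term_lookup(incoming, reference):
    print(match)
```

Rows that contain none of the reference terms simply produce no output in this sketch.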

Donald's focus on his blog is this newest version of DTS, and he has a number of other posts up already.
