Sybase NNTP forums - End Of Life (EOL)

The NNTP forums from Sybase - forums.sybase.com - are now closed.

All new questions should be directed to the appropriate forum at the SAP Community Network (SCN).

Individual products have links to the respective forums on SCN, or you can go to SCN and search for your product in the search box (upper right corner) to find your specific developer center.

Querying Long Varchar fields

4 posts in General Discussion Last posting was on 2008-05-08 17:28:47.0Z
Terry Williams Posted on 2008-05-08 01:58:49.0Z
Sender: 5a05.48225c8d.1804289383@sybase.com
From: Terry Williams
Newsgroups: ianywhere.public.general
Subject: Querying Long Varchar fields
X-Mailer: WebNews to Mail Gateway v1.1t
Message-ID: <48225e59.5a45.1681692777@sybase.com>
NNTP-Posting-Host: 10.22.241.41
X-Original-NNTP-Posting-Host: 10.22.241.41
Date: 7 May 2008 18:58:49 -0700
X-Trace: forums-1-dub 1210211929 10.22.241.41 (7 May 2008 18:58:49 -0700)
X-Original-Trace: 7 May 2008 18:58:49 -0700, 10.22.241.41
Lines: 21
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub ianywhere.public.general:6843
Article PK: 5079

I have an ASA 7 database that is about 10GB. One of the
tables has a column that contains a long varchar. We store
about 2K of text in this column. We provide numerous types
of searches based on other fields that are parsed out of
this 2K long varchar but we also allow the user to do an
exhaustive search on any text in the 2K column. In order to
do this we must use an sql statement that looks like

WHERE honkin_text_field LIKE %sometext%

This is not fast at all. We have had to limit the records
it searches based on another indexed field. My question is,
what is the best way to query a non indexed 2K field for
something that could appear anywhere in the field? Is there
another product we should use to do this or is there some
type of strategy to use? Our users want to be able to
return all records that match their wildcard text string and
they don't want to have to break up their queries. There is
a ton of records in the table, each about 2k apiece.

Thanks!


Mark Culp Posted on 2008-05-08 13:14:18.0Z
Message-ID: <4822FCAA.5D3A5833@iAnywhere.com>
From: Mark Culp <reply_to_newsgroups_only_please_nospam_mark.culp@iAnywhere.com>
X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: ianywhere.public.general
Subject: Re: Querying Long Varchar fields
References: <48225e59.5a45.1681692777@sybase.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
NNTP-Posting-Host: vip152.sybase.com
X-Original-NNTP-Posting-Host: vip152.sybase.com
Date: 8 May 2008 06:14:18 -0700
X-Trace: forums-1-dub 1210252458 10.22.241.152 (8 May 2008 06:14:18 -0700)
X-Original-Trace: 8 May 2008 06:14:18 -0700, vip152.sybase.com
Lines: 49
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub ianywhere.public.general:6844
Article PK: 5080

In ASA 7, evaluation of like predicates were (are) limited to
doing a table scans, testing each row for a match. (assuming
there were no other predicates that could be used to limit
the search space).

I think it was around ASA 8 (or one of the 8.0.x releases) that
patterns of the form 'prefix%' could also use an index scan
and hence would run much faster.

In SA 11, we have added full text search capabilities...
which means that you can build a text index on a column (or columns),
and then run queries that will search for terms within the text
of the column(s).

If you want to try out SA 11, you can get the beta by signing
up for it at
http://www.sybase.com/sqlany_panorama_beta_registration
--
Mark Culp
SQLAnywhere Research and Development
iAnywhere Solutions Engineering
-------------------------------------------------------------------------
-- SQL Anywhere Blog Center - http://www.sybase.com/sqlanyblogs
-- SQL Anywhere Developer Community:
-- http://www.sybase.com/developer/library/sql-anywhere-techcorner
-------------------------------------------------------------------------

Terry, Williams wrote:
>
> I have an ASA 7 database that is about 10GB. One of the
> tables has a column that contains a long varchar. We store
> about 2K of text in this column. We provide numerous types
> of searches based on other fields that are parsed out of
> this 2K long varchar but we also allow the user to do an
> exhaustive search on any text in the 2K column. In order to
> do this we must use an sql statement that looks like
>
> WHERE honkin_text_field LIKE %sometext%
>
> This is not fast at all. We have had to limit the records
> it searches based on another indexed field. My question is,
> what is the best way to query a non indexed 2K field for
> something that could appear anywhere in the field? Is there
> another product we should use to do this or is there some
> type of strategy to use? Our users want to be able to
> return all records that match their wildcard text string and
> they don't want to have to break up their queries. There is
> a ton of records in the table, each about 2k apiece.
>
> Thanks!


Terry Williams Posted on 2008-05-08 13:24:42.0Z
Sender: 5a05.48225c8d.1804289383@sybase.com
From: Terry Williams
Newsgroups: ianywhere.public.general
Subject: Re: Querying Long Varchar fields
X-Mailer: WebNews to Mail Gateway v1.1t
Message-ID: <4822ff1a.7132.1681692777@sybase.com>
References: <4822FCAA.5D3A5833@iAnywhere.com>
NNTP-Posting-Host: 10.22.241.41
X-Original-NNTP-Posting-Host: 10.22.241.41
Date: 8 May 2008 06:24:42 -0700
X-Trace: forums-1-dub 1210253082 10.22.241.41 (8 May 2008 06:24:42 -0700)
X-Original-Trace: 8 May 2008 06:24:42 -0700, 10.22.241.41
Lines: 62
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub ianywhere.public.general:6845
Article PK: 5081

Thanks for info Mark. Would an index on a 2K column be a
problem? I know it is hard to estimate without exact
hardware parameters but if we were to go to ASA 11 and
create the index on the 2k text column, how long would it
take to search a million 2K records using the LIKE
%sometext% clause assuming we had a massive machine with the
cache configured to equal the size of the DB (10GB)? Are we
talking a couple seconds or minutes?

Thanks,

> In ASA 7, evaluation of like predicates were (are) limited
> to doing a table scans, testing each row for a match.
> (assuming there were no other predicates that could be
> used to limit the search space).
>
> I think it was around ASA 8 (or one of the 8.0.x releases)
> that patterns of the form 'prefix%' could also use an
> index scan and hence would run much faster.
>
> In SA 11, we have added full text search capabilities...
> which means that you can build a text index on a column
> (or columns), and then run queries that will search for
> terms within the text of the column(s).
>
> If you want to try out SA 11, you can get the beta by
> signing up for it at
> http://www.sybase.com/sqlany_panorama_beta_registration
> --
> Mark Culp
> SQLAnywhere Research and Development
> iAnywhere Solutions Engineering
> ----------------------------------------------------------
> --------------- -- SQL Anywhere Blog Center -
> http://www.sybase.com/sqlanyblogs -- SQL Anywhere
> Developer Community: --
>
http://www.sybase.com/developer/library/sql-anywhere-techcorner
> ----------------------------------------------------------
> --------------- Terry, Williams wrote:
> >
> > I have an ASA 7 database that is about 10GB. One of the
> > tables has a column that contains a long varchar. We
> > store about 2K of text in this column. We provide
> > numerous types of searches based on other fields that
> > are parsed out of this 2K long varchar but we also allow
> > the user to do an exhaustive search on any text in the
> > 2K column. In order to do this we must use an sql
> > statement that looks like
> > WHERE honkin_text_field LIKE %sometext%
> >
> > This is not fast at all. We have had to limit the
> > records it searches based on another indexed field. My
> > question is, what is the best way to query a non indexed
> > 2K field for something that could appear anywhere in the
> > field? Is there another product we should use to do
> > this or is there some type of strategy to use? Our
> > users want to be able to return all records that match
> > their wildcard text string and they don't want to have
> > to break up their queries. There is a ton of records in
> > the table, each about 2k apiece.
> > Thanks!


Mark Culp Posted on 2008-05-08 17:28:47.0Z
Message-ID: <48233850.E851E056@iAnywhere.com>
From: Mark Culp <reply_to_newsgroups_only_please_nospam_mark.culp@iAnywhere.com>
X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: ianywhere.public.general
Subject: Re: Querying Long Varchar fields
References: <4822FCAA.5D3A5833@iAnywhere.com> <4822ff1a.7132.1681692777@sybase.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
NNTP-Posting-Host: vip152.sybase.com
X-Original-NNTP-Posting-Host: vip152.sybase.com
Date: 8 May 2008 10:28:47 -0700
X-Trace: forums-1-dub 1210267727 10.22.241.152 (8 May 2008 10:28:47 -0700)
X-Original-Trace: 8 May 2008 10:28:47 -0700, vip152.sybase.com
Lines: 91
Path: forums-1-dub!not-for-mail
Xref: forums-1-dub ianywhere.public.general:6846
Article PK: 5082

The way that the like predicate is handled has not changed
betweeen 10 and 11 (there may have been some minor changes?),
so if you go to 11, you will want to rethink the way that you
generate the query.

A text index stores complete positional information for every
instance of every term in every indexed column. A 'term' is
generally the same as what we typically think of as a 'word',
but you have some control over how the the terms are choosen.

I cannot comment on the performance characteristic of your
application (once written) because there are numerous factors
upon which this will depend.

But, for example, we have an internal database that has 200000
rows and a text index on a column which varies in size between
0 and 1 million character (average size is 2200 chars), and
a fairly complex text search completes in sub-second time.

You mileage will vary.

If upgrading to SA 11 is an option, I would hightly recommend
that you get the beta and read through the text index documentation
to learn more about this topic.

- Mark

Terry, Williams wrote:
>
> Thanks for info Mark. Would an index on a 2K column be a
> problem? I know it is hard to estimate without exact
> hardware parameters but if we were to go to ASA 11 and
> create the index on the 2k text column, how long would it
> take to search a million 2K records using the LIKE
> %sometext% clause assuming we had a massive machine with the
> cache configured to equal the size of the DB (10GB)? Are we
> talking a couple seconds or minutes?
>
> Thanks,
>
> > In ASA 7, evaluation of like predicates were (are) limited
> > to doing a table scans, testing each row for a match.
> > (assuming there were no other predicates that could be
> > used to limit the search space).
> >
> > I think it was around ASA 8 (or one of the 8.0.x releases)
> > that patterns of the form 'prefix%' could also use an
> > index scan and hence would run much faster.
> >
> > In SA 11, we have added full text search capabilities...
> > which means that you can build a text index on a column
> > (or columns), and then run queries that will search for
> > terms within the text of the column(s).
> >
> > If you want to try out SA 11, you can get the beta by
> > signing up for it at
> > http://www.sybase.com/sqlany_panorama_beta_registration
> > --
> > Mark Culp
> > SQLAnywhere Research and Development
> > iAnywhere Solutions Engineering
> > ----------------------------------------------------------
> > --------------- -- SQL Anywhere Blog Center -
> > http://www.sybase.com/sqlanyblogs -- SQL Anywhere
> > Developer Community: --
> >
> http://www.sybase.com/developer/library/sql-anywhere-techcorner
> > ----------------------------------------------------------
> > --------------- Terry, Williams wrote:
> > >
> > > I have an ASA 7 database that is about 10GB. One of the
> > > tables has a column that contains a long varchar. We
> > > store about 2K of text in this column. We provide
> > > numerous types of searches based on other fields that
> > > are parsed out of this 2K long varchar but we also allow
> > > the user to do an exhaustive search on any text in the
> > > 2K column. In order to do this we must use an sql
> > > statement that looks like
> > > WHERE honkin_text_field LIKE %sometext%
> > >
> > > This is not fast at all. We have had to limit the
> > > records it searches based on another indexed field. My
> > > question is, what is the best way to query a non indexed
> > > 2K field for something that could appear anywhere in the
> > > field? Is there another product we should use to do
> > > this or is there some type of strategy to use? Our
> > > users want to be able to return all records that match
> > > their wildcard text string and they don't want to have
> > > to break up their queries. There is a ton of records in
> > > the table, each about 2k apiece.
> > > Thanks!