Copy link to clipboard
Copied
I need help with the most efficient way to search for same digits in ids dynamically.
After I queried my table: select MyId from Mytable then I need to check if none of these ids consist of the same repeating digits such as 0000,1111,2222,......9999
To make it more complicated, MyId doesn't have a fix length, it can be 4 digits length or 5, 6 all the way to 9. So the search of the same digits it this ids also should also correspond to the length of the ids.
If MyId is 4 digit long, I should make sure none of the MyId contains 0000,1111,2222,33333,...all the way to 9999
Can anyone give me example on how to do this efficienly since I may deal with large records.
Copy link to clipboard
Copied
Unless I misunderstand the question, just do this:
where MyID <> 9999
or whatever number you don't want to match.
Copy link to clipboard
Copied
Dan,
I can't do that since I don't know the length of MyId. If I know MyId is always 4 digits length, then your suggestion is working
Copy link to clipboard
Copied
Questions:
1. What database server, vendor and version, are you using?
2. What is the data type of the MyId field?
If your database server supports using regular expressions in the WHERE clause you might use a pattern like (\d)\1+ to search for repeating digits.
See: http://www.regular-expressions.info/charclass.html
Copy link to clipboard
Copied
It's Sybase, I think version 4.0 (?) and the datatype for MyId is varchar because it can accept alpha numeric. But we don't bother checking MyId other than 1111,2222,3333, all the way to 9999 if MyId happen to be 4 char length. Or 11111111,22222222,33333333, etc if it happens to be 8 char length.
Within 1 batch, when MyId is 4 char length, all will be 4 char length.
In tomorrow's batch, MyId can be 6 char length and next month it may be 4, 6 char length or 7 char lenght. Max char length for MyId is 9 char length.
So, within 1 batch there will be no mixing of MyId's char length.
I'm thinking of doing Len(MyId) first to find out the lenght of MyId for this batch, then looping to create 111,222,333,etc for the length determined by Len(MyId) and before moving to the next same digit, I do select MyId where MyId Like '%1111....%', then if recordcount is not 0 do something, then move to the next same digit and so on.
But I'm not sure how to write that efficiently when processing 5000 or more records for example.
Copy link to clipboard
Copied
Regarding
But I'm not sure how to write that efficiently when processing 5000 or more records for example.
Consider loading every record into a working table and processing them with sql.
Copy link to clipboard
Copied
I advise that you verify the version of Sybase you are using. Version 4 sounds very old and is likely incorrect.
You might follow Dan Bracuk's advise. Load your data into a SQL table and query the data there. It looks like Sybase (at least version 11) supports regular expressions in queries which could be used to query for patterns in your data.
Sybase Regular expressions:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sqlanywhere.11.0.0/dbreference_en11/rf-sqllanguage-s-3888756.html
Copy link to clipboard
Copied
It looks like Sybase (at least version 11) supports regular expressions in queries which could be used to query for patterns in your data. Sybase Regular expressions:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sql anywhere.11.0.0/dbreference_e...
It doesn't seem like Sybase regexes support back references, which is annoying. This is not based on any specific knowledge, but on no mention in the docs, and nothing on Google for search strings like "sybase regex backreference".
However this regex would work in CF if you wanted to pull the IDs back to CF and analyse them:
^(.)(?:\1{3,8})$
This matches strings of 4-9 length containing only the same character, but will match chars other than digits. If you only want to find "1111" but not "AAAA". then change it to this:
^(\d)(?:\1{3,8})$
That said, I'd try to process this all on the DB if poss, which might mean using something along the lines of what BKBK has suggested.
--
Adam
Copy link to clipboard
Copied
I know little about Sybase. I'm just guessing with:
select myID from Mytable
where NOT REGEXP_SEARCH(MyTable.myID, "[1]{4,9}|[2]{4,9}|[3]{4,9}|[4]{4,9}|[5]{4,9}|[6]{4,9}|[7]{4,9}|[8]{4,9}|[9]{4,9}")
Copy link to clipboard
Copied
If all else fails, you've always got the obvious, 'dumb' approach,
<cfset digitList = "'0000','1111','2222','3333','4444','5555','6666','7777','8888','9999','00000','11111','22222','33333','44444','55555','66666','77777','88888','99999','000000','111111','222222','333333','444444','555555','666666','777777','888888','999999','0000000','1111111','2222222','3333333','4444444','5555555','6666666','7777777','8888888','9999999','00000000','11111111','22222222','33333333','44444444','55555555','66666666','77777777','88888888','99999999','000000000','111111111','222222222','333333333','444444444','555555555','666666666','777777777','888888888','999999999'">
<cfquery>
SELECT myID FROM Mytable
WHERE myID NOT IN (#preserveSingleQuotes(digitList)#)
</cfquery>
Copy link to clipboard
Copied
Care to elaborate on "So the search of the same digits it this ids also should also correspond to the length of the ids"?
Copy link to clipboard
Copied
Dan, I just posted my explanation
Get ready! An upgraded Adobe Community experience is coming in January.
Learn more