Auto-increment an ID with sequence and trigger: How to handle manual overrides (causing non-unique error)

by Wilson   Last Updated June 28, 2017 22:06 PM

I have a sequence and trigger that provide auto-incrementing IDs to a PROJECT_ID column:

PROJECTS
+------------+-----+
| PROJECT_ID | ... |
+------------+-----+
| LC000001   |     |
| LC000002   |     |
| LC000003   |     |
| LC000004   |     |
| LC000005   |     |
+------------+-----+

CREATE SEQUENCE SEQ_PROJECT START WITH 1 INCREMENT BY 1

CREATE OR REPLACE TRIGGER "PROJECTS_PROJECT_ID_TRIG" BEFORE INSERT ON PROJECTS
    FOR EACH ROW
    WHEN (NEW.PROJECT_ID IS NULL) 
      BEGIN
       :NEW.PROJECT_ID := 'LC' ||to_char(SEQ_PROJECT.NEXTVAL,'FM000000');  
      END;

Source: Trigger to get next prefixed ID: Is there a better way?

The sequence and trigger work just fine under normal circumstances.

However, the editing environment in my application is far from normal. Users edit the table by viewing the table as though it were an Excel spreadsheet (holy smokes!). The scenario is called "editing in the attribute table window". This means users can manually edit any column in any row. Which means that users can override the IDs that are generated by the sequence and trigger.

For example:

  1. A user creates a new row. The sequence and trigger provide an ID of LC000006.
  2. A user then goes and manually changes LC000001 to LC000007. (I can't think of a good reason why a user would do this, but they have the ability, which means it will probably happen.)
  3. A user creates a new row. The sequence and trigger provide an ID of LC000007.

    Error: Database Row Change: An unexpected failure occured. Underlying DBMS error [ORA-00001: unique constraint (USER1.PROJECT_ID_IDX) violated][USER1.PROJECTS]

The error is thrown because there is a unique index on the field.

Practically speaking, I can live with the fact that my users would get an error like this (if someone on the team was unwise enough to do such a thing). All they really have to do is try again, and the sequence/trigger just magically skips to the next number.

But of course, there's also the possibility that there are 10's or 100's of IDs that are out-of-sequence. Believe it or not, this is a real possibility, due to the existence of a tool called the field calculator. Getting an error message for every out-of-sequence ID doesn't seem so practical anymore.

So, I'm curious, can I avoid this problem somehow? How do the pro's normally handle this kind of thing?



Answers 3


Assuming I'm following you correctly, you need to deal with using a sequence to populate an ID field where there's already existing data. You want to make sure that the new ID values generated don't conflict with existing values.

In my experience, you have three options:

  • Ignore any numbers that have been skipped over; find the current highest number, and start your sequence there;
  • Re-number the existing rows so all current ID values are contiguous (a solid block: for instance, 4012 records, with IDs LC000001 - LC004012);
  • Instead of using a sequence, maintain a table of available ID values, and assign new numbers out of that.

First, you need to evaluate the state of your existing values. Your ID allows for numbers between 1 and 999,999. Currently, what's the highest number in use, how many rows are there, and how much activity does that represent?

Let's consider a few different scenarios:

  • Highest number = LC010203; row count = 8192

    Recommendation: set the sequence to start with 10204, and don't worry about the ~2000 possible numbers you've skipped. They represent 0.2% of the total IDs available - that's almost certainly too little to be worth messing with.

  • Highest number = LC200001; row count = 5,120; data represents 5 years of business

    ** Recommendation**: again, you can probably skip the unused values and start with 200002. Yes, you've lost almost 20% of your potential numbers, but even if you average 5 times as much business going forward as you've seen in the past 5 years, what's left should hold you for over 30 years.

    NOTE: you may have a special case that makes the loss of the "skipped" values irrelevant. For example, if you use a new two character prefix every year, you may have the option of restarting your ID sequence at the start of every years (as long as you're actually storing LC200001, not just storing 200001 and formatting it by putting the LC in front of it in the application and in reporting tools.

  • Highest number = 'LC909090', row count = 5,000, covers 1 year in business

    Recommendation: OK, you've got a problem, and you probably can't just start from 909091; you'd be likely to run out of numbers in two years or less.

I've been there.

At that point, the simplest option is to renumber all the existing records, so your 5,000 rows with IDs ranging from LC000123 to LC909090, become 5,000 rows with ID values ranging from LC000001 to LC005000. If the ID value is used as a foreign key, you've got to identify the foreign keys (some may not be declared as such in the DB structure); at least temporarily disable whatever mechanisms enforce data integrity; and then update all the values (in both the local and the remote tables), maintaining that integrity in your code alone.

However, you may be in even worse shape. If the ID value is used by people (whether it's how everyone in your organization identifies the "thing", or if it's the number your customers write on their payment slips to match their payments correctly to their account), then changing it could be a very bad idea.

If you can't ignore them, and you can't reset them, then one remaining option is to determine what all the available numbers are, store those in a table, and use a process you create to assign the available numbers to new records.

I had to do something very like this, in a former position. We used a "numbers" table to list all the possible values; removed the ones in use; and saved the rest to a table. The table included a column noting whether the ID has been used or not (initially, all were unused)

When we needed to assign a new number, we'd run an UPDATE query, marking the lowest unused number as used, and using the RETURNING clause to get back the number marked (one atomic action avoided issues, since this was actually used on a web application to sign up new customers).

As painful as this is to setup, it can be a better option than changing all the existing IDs. Imagine getting hundreds or thousands of customer payments a day, and having to apply those payments to your customers' account manually, because you have to look up each customer by the name and address on the payment, because the customer number on the check and/or remit slip no longer matches what's in the system....

RDFozz
RDFozz
June 29, 2017 00:39 AM

Instead of trying to address what happens after someone updates the PROJECT_ID column, have you tried denying updates of the PROJECT_ID column in the first place?

Perhaps revoking update privileges on the PROJECT_ID column? Or perhaps ... yuck ... a trigger to rollback (and generate an error) any attempts to update the PROJECT_ID column?

Since you're ok with them getting an error when they attempt to generate a duplicate PROJECT_ID, I'm assuming you'd be ok with them getting an error when attempting to update PROJECT_ID ... and after a couple such attempts ... and perhaps a "WTF!?!?!" email to the DBA (you?) ... they'll get used to the fact that they can't update the PROJECT_ID column.

Or do you really want to allow them to update the PROJECT_ID column?

markp
markp
June 29, 2017 01:40 AM

From what I understood, It is not much complicated.

i) You have one unqiue column Project_ID ,which can be edited by user also.

ii)As a result there will be gap in sequence. Let it be,Since you can't change user requirement.I will catch the error number and show user friendly message instead. "This id is not available"

iii)I will add 2 extra column,if you don't already have.

iv) One column will hold ActualProjectID which will be in proper sequence and is unique and won't be edited by anyone.This column help me in showing project id in sequence where it is require.

v) Second column will be simple int auto increament.This column I will make it PK and also use it as FK purpose.

I am sure iv) column,but not so sure about v) column becasue I will have to do further analisys.

KumarHarsh
KumarHarsh
June 29, 2017 10:16 AM

Related Questions




Function - PostgreSQL 9.2

Updated April 25, 2016 08:02 AM

Prevent explicit insert into serial columns

Updated March 18, 2017 18:06 PM