Answered
Database Design Question
Currently we have a subscription service website which
manages time
and attendance for companies using a MySQL database. Currently all
companies and users are in the same database. Every company has a
companyid and every user has a userid and is joined to a company by
the companyid. We do all standard select, insert, update, delete
queries you would expect using the companyid and userid as keys for
joining and selecting records across tables. The companyid and userid
is looked up and stored in the session when a user logs in.
This has worked well until recently where there have been a few
concerns. One is the potential security risks of running all queries
on the companyid condition. While we take every precaution to not pass
the companyid in any forms or urls for sql injection if someone was
able to alter their session companyid in some freak way they could get
to the sensitive data of any other customer. Also, any coding errors
as some of our quries are quite complex have the potential for
altering other users data with careless bugs (and there have been a
few). More importantly though is the issue of locking data and
preventing race conditions. Currently there is a single admin per
company who can run a report for a record set, review those records
and mark the ones to be updated, and then submit it for processing. We
need to add the ability to have more than one admin with this ability.
Obviously, if two users then try to do this same this there a
potential race condition which may occur. On top of that, employees
have the ability to modify their records as well. If an admin runs a
report and is reviewing it and an employee updates a record obviously
the admin will not see it and someones data will be overwritten.
Obviously this is undesirable. We considered using read locks on the
table so that while an admin is reviewing records no one else could
update until they submit their action, obtain a write lock on the
table, update their data, and release the lock. The problem is that
locking the table will lock ALL users from the table not just the
users of the specific company. I looked into creating a view on the
specific data and locking the view but as far as I can tell from the
documentation, a lock on a view requires a lock on all the base tables
involved and now were back to square one.
On top of all this, recently we had a corruption in a MySQL table
which affected ALL users records. It was fixed easily enough with
MySQL's repair utilities but it affected service for all users for
almost 20 minutes.
So, we thought about changing our approach to create a new database
with every new company. This would ensure that anyone can ever access
only one company's data and prevent and possibility of cross data
contamination or security breaches. This would also allow table locks
to be employed as an admin could perform their actions with the
knowledge that they are essentially locking the system from their
employees until they are done, which is an acceptable compromise. Any
table corruption would affect only one company as well without
affecting the entire system. Also, as we grow tables, are growing
exponentially. We're in the 10's of mb for some tables so it's not an
issue at the moment but MySQL can only handle up to 4GB per table.
This would ensure plenty of growth per company before it ever became
an issue. And lastly, a single company's data can easily be backed up
and restored in it's entiretly with affecting the rest of the system.
We currently run hourly backups and in the event a database or table
needs to be restored a single company will lose at most one hour of
data instead of all companies in the system. In one case we are
already doing this of sorts. We have a customer on a dedicated machine
just for them in which we customized the software to fit their
specific needs with their own MySQL server and database.
The downsides include having to keep a master database to manage a
single signon solution which I think is there the real problem with
this plan is. All new users and companies would have to be inserted
into a master database table to generate unique ids and to store what
datasource to use when authenticating the user. Then there is the
overhead of creating an entire new database everytime a user signs up
with our system and updating the database structure across hundreds of
companies, potentially thousands, as we add features. It would mean
adding a new datasource for every new company. It would also prevent
the ability for us to track and monitor system statistics easily, like
the total number of users since it would be the the aggregate value of
one table in hundreds of databases.
Maybe this is a completely ridiculous solution but it seems like it
might have it's benefits (as well as trade offs). We're gearing up for
a potential ground up rewrite of our system and this would be the
place to start.
Any comments or suggestions for or against would be greatly
appreciated.
Anthony
and attendance for companies using a MySQL database. Currently all
companies and users are in the same database. Every company has a
companyid and every user has a userid and is joined to a company by
the companyid. We do all standard select, insert, update, delete
queries you would expect using the companyid and userid as keys for
joining and selecting records across tables. The companyid and userid
is looked up and stored in the session when a user logs in.
This has worked well until recently where there have been a few
concerns. One is the potential security risks of running all queries
on the companyid condition. While we take every precaution to not pass
the companyid in any forms or urls for sql injection if someone was
able to alter their session companyid in some freak way they could get
to the sensitive data of any other customer. Also, any coding errors
as some of our quries are quite complex have the potential for
altering other users data with careless bugs (and there have been a
few). More importantly though is the issue of locking data and
preventing race conditions. Currently there is a single admin per
company who can run a report for a record set, review those records
and mark the ones to be updated, and then submit it for processing. We
need to add the ability to have more than one admin with this ability.
Obviously, if two users then try to do this same this there a
potential race condition which may occur. On top of that, employees
have the ability to modify their records as well. If an admin runs a
report and is reviewing it and an employee updates a record obviously
the admin will not see it and someones data will be overwritten.
Obviously this is undesirable. We considered using read locks on the
table so that while an admin is reviewing records no one else could
update until they submit their action, obtain a write lock on the
table, update their data, and release the lock. The problem is that
locking the table will lock ALL users from the table not just the
users of the specific company. I looked into creating a view on the
specific data and locking the view but as far as I can tell from the
documentation, a lock on a view requires a lock on all the base tables
involved and now were back to square one.
On top of all this, recently we had a corruption in a MySQL table
which affected ALL users records. It was fixed easily enough with
MySQL's repair utilities but it affected service for all users for
almost 20 minutes.
So, we thought about changing our approach to create a new database
with every new company. This would ensure that anyone can ever access
only one company's data and prevent and possibility of cross data
contamination or security breaches. This would also allow table locks
to be employed as an admin could perform their actions with the
knowledge that they are essentially locking the system from their
employees until they are done, which is an acceptable compromise. Any
table corruption would affect only one company as well without
affecting the entire system. Also, as we grow tables, are growing
exponentially. We're in the 10's of mb for some tables so it's not an
issue at the moment but MySQL can only handle up to 4GB per table.
This would ensure plenty of growth per company before it ever became
an issue. And lastly, a single company's data can easily be backed up
and restored in it's entiretly with affecting the rest of the system.
We currently run hourly backups and in the event a database or table
needs to be restored a single company will lose at most one hour of
data instead of all companies in the system. In one case we are
already doing this of sorts. We have a customer on a dedicated machine
just for them in which we customized the software to fit their
specific needs with their own MySQL server and database.
The downsides include having to keep a master database to manage a
single signon solution which I think is there the real problem with
this plan is. All new users and companies would have to be inserted
into a master database table to generate unique ids and to store what
datasource to use when authenticating the user. Then there is the
overhead of creating an entire new database everytime a user signs up
with our system and updating the database structure across hundreds of
companies, potentially thousands, as we add features. It would mean
adding a new datasource for every new company. It would also prevent
the ability for us to track and monitor system statistics easily, like
the total number of users since it would be the the aggregate value of
one table in hundreds of databases.
Maybe this is a completely ridiculous solution but it seems like it
might have it's benefits (as well as trade offs). We're gearing up for
a potential ground up rewrite of our system and this would be the
place to start.
Any comments or suggestions for or against would be greatly
appreciated.
Anthony
