Digitizing Documents and Repository

Gavin Heer

SSChasing Mays

Points: 646
November 28, 2008 at 9:46 am

#198231

Just want to gather some ideas. Our company is looking at going paperless in the near future and digitizing documents in our Finance/Legal departments and beyond. This does not involve multimedia at this time. I imagine we're looking at ~10-20 TB of information to store in a database repository. What would be the best repository for this purpose, i.e. SQL Server or Oracle? SharePoint would be the front end driving this. We are currently an MS SQL shop running clustered SQL Server 2005 with SAN LUNs attached to each node. I believe 2 TB is the maximum size of a LUN on any given node; that's what we've configured here. Does anyone have suggestions on how best to store all this data? I'm sure search crawls will be tedious on databases with over 2 TB of data. We are running 64-bit server hardware, and compression of the data, which will most likely be stored in XML format, will be a factor in the design. SQL Server 2008 has some advantages here, but is it the best solution for storing terabytes of data in a completely digital, paperless environment? Any thoughts or ideas would be greatly appreciated.


Jack Corbett SSC Guru Points: 184388 · Answer 1

If you are planning on using SharePoint and its document management solution, then you are tied to SQL Server, since that is the backend of SharePoint.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Gavin Heer SSChasing Mays Points: 646 · Answer 2

November 28, 2008 at 10:45 am

#904906

Thanks, Jack, for your response. If we are bound to SQL Server for SharePoint, is storing terabytes of data practical for SQL Server in normal operation, or should we investigate a different front end for a document management solution?

Jack Corbett SSC Guru Points: 184388 · Answer 3

I've never worked with databases that size, but with the proper configuration I don't know why SQL Server wouldn't scale to that. At that point I think it's more of a storage issue than a querying issue, since the workload probably would not be highly transactional.

Cory Benefield SSC Enthusiast Points: 105 · Answer 4

March 15, 2009 at 1:35 pm

#959963

SQL Server will scale via federated servers. Naturally, you will want to partition the database and use multiple filegroups and files, so the 2 TB limit will not be an issue.

You may want to start federated so the foundation is laid for growth.

I have used SQL Server for databases over 1 TB without federating and it works fine, even better on 64-bit.
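To put rough numbers on the multiple-files idea, here is a minimal sizing sketch. It assumes the 2 TB per-LUN limit mentioned in the original question and one data file per LUN; the 20% free-space headroom and the function name `luns_needed` are illustrative assumptions, not recommendations from this thread.

```python
TB = 1024 ** 4  # bytes per terabyte (binary)


def luns_needed(total_bytes, lun_limit_bytes=2 * TB, headroom_pct=20):
    """Return how many LUN-sized data files are needed to hold total_bytes.

    headroom_pct is the share of each LUN left free for growth
    (an assumed figure, adjust to your own policy).
    """
    usable = lun_limit_bytes * (100 - headroom_pct) // 100
    # Ceiling division using integers only.
    return -(-total_bytes // usable)


# The ~10-20 TB range from the original question.
for size_tb in (10, 20):
    print(size_tb, "TB ->", luns_needed(size_tb * TB), "LUNs")
```

With no headroom, 10 TB fits in exactly five 2 TB files; leaving room for growth pushes the count up, which is worth knowing before the filegroups are laid out.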

Johan Bijnens SSC Guru Points: 134873 · Answer 5

I would suggest contacting SQLCAT (the Microsoft SQL Server Customer Advisory Team).

They may have more experience with that kind of volume and may be in close contact with the SharePoint team.

Johan


DNA_DBA SSCrazy Eights Points: 8974 · Answer 6

May 21, 2009 at 2:07 am

#998110

We have SharePoint and the total database size is approaching 1 TB. It runs on a 4-CPU database server with 32 GB of RAM. However, there are many databases that make up our SharePoint system, and we intend to split it out across multiple database servers and web farms. We see heavy disk access and store many documents in it.

Ed Zann SSCrazy Eights Points: 8694 · Answer 7

We use a content management system called ApplicationXtender to manage electronic documents. In our case, well over half of our documents come from external sources, and the paper must be scanned and indexed. Only the document index data is stored in the SQL database. The actual document files are on either magnetic or optical media in TIF format. It would be impractical and unnecessary to OCR and full-text index all of these externally sourced documents. We tag each document with 5 to 10 identifiers that can be used to retrieve the document: fields like acct#, taxid, name, doc type, doc date, etc.
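The index-only approach can be sketched as follows. This is not how ApplicationXtender works internally; it is only an illustration of keeping a handful of identifier fields plus a file location in the database while the scanned TIF stays on external storage. SQLite stands in for SQL Server so the sketch runs anywhere, and the table name, column names, and sample values are all hypothetical.

```python
import sqlite3


def create_index(conn):
    """Create a small index table: identifiers plus a pointer to the file."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS doc_index (
            doc_id    INTEGER PRIMARY KEY,
            acct_no   TEXT,
            tax_id    TEXT,
            name      TEXT,
            doc_type  TEXT,
            doc_date  TEXT,           -- ISO 8601 date string
            file_path TEXT NOT NULL   -- where the TIF lives outside the database
        )""")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS ix_doc_acct ON doc_index (acct_no, doc_type)")


def add_document(conn, acct_no, tax_id, name, doc_type, doc_date, file_path):
    """Record the identifiers for one scanned document and return its id."""
    cur = conn.execute(
        "INSERT INTO doc_index "
        "(acct_no, tax_id, name, doc_type, doc_date, file_path) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (acct_no, tax_id, name, doc_type, doc_date, file_path))
    conn.commit()
    return cur.lastrowid


def find_paths(conn, acct_no, doc_type):
    """Look up file locations by identifier; the file itself is never read."""
    rows = conn.execute(
        "SELECT file_path FROM doc_index "
        "WHERE acct_no = ? AND doc_type = ? ORDER BY doc_date",
        (acct_no, doc_type))
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_index(conn)
add_document(conn, "A-1001", "12-3456789", "Example Co", "invoice",
             "2009-05-01", "/archive/2009/05/0001.tif")
print(find_paths(conn, "A-1001", "invoice"))
```

Because the database holds only a few short fields per document, its size grows with the number of documents rather than with the size of the scanned images.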