Storing PDF and JPEG file in SQL Server

  • I have an application that stores and retrieves a lot of documents which I store in folders. This is working well until the number of documents getting too big. Does any one knows whether or not I can store the PDF and JPEG files in SQL Server instead of leaving them in a folder? and whether or not this is the right approach to handle store & retrieve a large number of documents? Thank you in advance for all your help. KP

  • You can the data type is image, Please read up on it in books online. Find books that talke about this. It works fine, but not an ideal file management system. Also note are there alot of deletes?

    I'm not sure if its still a problem but sql server does not like to give back the space used by an image row. Even after its deleted.

     

  • Thanks Ray!! Delete will be minimum, therefore it is not an issue. You don't think SQL is an ideal file management system, then which one qould you recommend for me to look at?

  • Sql server is a rdbms system. Not a document management system.

    You can certainly use sql to store your images/pdfs. it works fine. But there are some inherent issues. its doable, just make sure you read up on the data type and how to handle them. It is alot more work to read data from the row. and the storage is different.

     

  • I agree with Ray, SQL Server is not a good document management system. When I have little documents I use SQL Server to store the path and other attributes of the documents like the list of users that opened it.

    But, if you have to manage great amount of documents you have to use a documental database like Exchange or Lotus Notes. In fact, what I use is a mix among sqlserver and Exchange. Exchange is used as a document repository with a tree structure (like folders) and sqlserver as an attribute storage.

  • Sergei,

    Do you have any data that suggests that MSSS is bad for this purpose, or any data that Exchange or Lotus Notes are better in some way?  Or any data that that shows that storage in the database is "bad" ?  I hear these arguments over and over, but I have never seen anybody offer any data.

    Personally, I have tried to work with linked files: it is not easy, and teh system is prone to corruption over time.  It might be a bit faster in the short run, but over time it turns into a major headache.  Add auditing to the mix, and you have a real major project on your hands. 

    Working with files within the database is almost trivial, in contrast.  Sure, it might take a bit longer to retrieve and store the file, but users rarely need one than one record at a time.  It need not take any more disk space than a standard file storage system.   Of course, it is also possible to zip and unzip the files on the fly for more compact storage in the database. 

    So leaving all the dogma and hype aside, is there any data available that compares these different approaches in a real-world scenario?  I certainly have not seen any.

  • In our doc management system we offer both FTP-based storage (files are effectively part of a file system though) or BLOB-based storage.  I personally prefer the BLOB based storage as it is an extra level of abstraction and removes the documents from where they may be deleted on disk, etc.  Also, network shares and other messy things needn't be created.  Backups are just of a database, not a database plus a file system.

    Yes, there are some downsides (eg MSDE gives you 2GB per DB - need to create new DBs as needed to store the BLOBs), but it certainly helps lock the documents up.  Since ours is a medical application, it is a sales plus if we have the documents away from parts of the file system where users may delete them even if they can't read the encrypted files (many small medical practices have every as administrator - YUK!)

  • I've been using SS2K for a while now as a document repository with no problems.  One DB has 1800 documents (PDF, Word, etc) totaling 4.5GB and another is storing 52,000 XML documents totaling over 10GB.  These DBs are on the same server which isn't a massive powerhouse.  The data is fed/retrieved through both ASP websites and VB6 and VB.NET client applications.

    There is a document Frank Kalis always used to post whenever anyone asked about this topic.  I think it's a link to something on MSDN.  Might want to search Frank's posts to find it, it was really helpful.

  • Thank you all for your responses. Eventhough there is no clear answer for this issue, I think I will use Sql Server to store the document after I convert it to PDF (one of the good thing with our application is that after the document is stored, the user can only view it without the ablity to edit). PDF file is compact by nature and hopefully this will work out well. I tried this method with a couple document yesterday and it seems to work fine. Let's see how this goes in production...Thank you all!!!

  • SQL Server not good for document storage?

    Don't tell that to the SharePoint development team, Visual Studio Team System team, or the WinFS team.  All those MS projects/products use SQL Server to store files.  Also, I bet you can't convince the dozens of document management apps that use rdbms's to store documents.

    It is appropriate to say SQL Server does not have good native document storage capability.  But with a front-end application, documents can definitely be stored in SQL Server (as Ray M points out, use the image datatype).  It can be argued that you will save space (as you will not have wasted space in file clusters), more reliable (not dependent on a file cluster chain), and provides better backup and recovery.

     



    Mark

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply