Well, I was perusing the LinkedIn Groups this weekend while laid up after dental surgery, and I found that a few discussions were around the topic:
“I have a gazillion paper documents, and I want to put them into SharePoint”
My suggestion: don’t mix up the desktop shredder and the scanner. I only made that mistake once...
So here are some ideas on ways to get that process moving.
First, let’s cover tools. I am going to start with some well-documented and well-liked commercial solutions:
Kofax, Websio, and KnowledgeLake. These tools are extremely solid and have been implemented on some of the biggest enterprise farms going today. I have experience with KnowledgeLake and Kofax, and I have to say I was very impressed with the UI and the support for the tools. The installation was tricky, but worth it in the end. Try them out; don’t take my word for it.
With commercial-level tools like this you are going to pay a premium for licensing and support each year, so weigh that into your consideration. Remember, once you implement a solution that is a paid service, you have to consider what the cost of turning it off is (rework, training, solution removal, refactoring, change management, etc.). This should be part of your Total Cost of Ownership.
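To make the Total Cost of Ownership point a little more concrete, here is a rough Python sketch. The function name and every number in it are made up for illustration; they are not real vendor pricing, so plug in your own quotes.

```python
def total_cost_of_ownership(license_per_year, support_per_year, years,
                            implementation=0.0, exit_cost=0.0):
    """Return a simple TCO: one-time costs plus recurring costs over `years`.

    exit_cost is what it costs to turn the tool off later (rework, training,
    solution removal, refactoring, change management).
    """
    recurring = (license_per_year + support_per_year) * years
    return implementation + recurring + exit_cost


# Hypothetical figures only, chosen to show the arithmetic.
commercial = total_cost_of_ownership(
    license_per_year=20_000, support_per_year=4_000, years=5,
    implementation=15_000, exit_cost=30_000)
print(f"5-year TCO: ${commercial:,.2f}")
```

The thing to notice is that the exit cost sits in the total from day one, even though you only pay it if you walk away.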
Custom Development – Codeplex
OK so let’s look at options your developers are going to love you for.
embDocumentInhalator, a CodePlex solution, has a few hundred implementations. I have tried it and have to say I was impressed with the end solution. Now, the base solution you download is boring and looks, well, like garbage, so you will need to style it. Second, you still need a scanner solution to leverage. I recommend Windows Image Acquisition (http://de.wikipedia.org/wiki/Windows_Image_Acquisition), which seems to work with nearly everything I have thrown at it. However, you can come up with a scanner solution specific to your monster Canon or Konica if you wish. In some cases someone already has, so look around on the internets…lol
Option 2 for the custom folks is a Do It Yourself option. Using some fun .NET and PowerShell commands you can do nearly anything. Just keep in mind: if you build it, you own it. You own the maintenance, support, and long-term life. Total Cost of Ownership, again.
OK, so now let’s consider how in God’s name you can do this OOB for $0.00 beyond your Enterprise License. Sorry folks, if you have Foundation or Standard you need to consider the above two options.
So the trick is five steps:
Step 1: Content Type Classification
Why content types? Well, like in the comic books, “with great power comes great responsibility”, and I do not want you just scanning a gazillion PDFs into my SharePoint. I know, I know: how is a content type going to solve this? We are going to use auto classification to assist us. By creating a content type called “Scanned File” you now have a way to apply metadata, workflows, publishing, retention policy, IMP (information management policy), and a slew of other content type goodness.
Create the content type and plan to add it to the libraries I talk about in step 2.
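If it helps to picture what the “Scanned File” content type buys you, here is a small sketch in plain Python (not SharePoint code). The column names and the file name prefix rule are my own assumptions for illustration; your scanner profiles and metadata columns will differ.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass
class ScannedFile:
    """Stand-in for the "Scanned File" content type and its metadata."""
    file_name: str
    content_type: str = "Scanned File"
    department: str = "Unclassified"   # hypothetical metadata column
    needs_review: bool = True          # hypothetical metadata column


# Hypothetical rule: the scanner profile prefixes the name, e.g. "HR_0001.pdf".
PREFIX_TO_DEPARTMENT = {"HR": "Human Resources", "LEGAL": "Legal"}


def classify(file_name: str) -> ScannedFile:
    """Fill in metadata from the file name; unknown prefixes go to review."""
    prefix = PurePath(file_name).stem.split("_", 1)[0].upper()
    department = PREFIX_TO_DEPARTMENT.get(prefix)
    if department is None:
        return ScannedFile(file_name)
    return ScannedFile(file_name, department=department, needs_review=False)
```

Anything the rule cannot place keeps the default values, which is the point: every scan gets the same content type and lands somewhere a human can find it.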
Step 2: Scan to file
So most folks have a network copy/scan/fax machine (if you don’t, seriously, save some money and buy one on eBay), and we are going to use it to the max.
I recommend creating a Site Collection for this the first time, just in case you hate the results. Otherwise, any old document library will work. I like to enable email support, enable content types, and adjust some thresholds for this site to support large files, and a lot of them. Just keep in mind the total size of a few hundred scanned PDF files. Apply the fun content type we created before.
Now we could spend a lot more time on settings, but naaa, that is boring. Once you have the libraries in place (with some fun names like Scan Repository A, Scan DropBox) you have two options: you can add them to your network as mapped network locations, just like a file share, or you can create an associated email address and configure the email settings on the site to save the file and delete the email. I like the email option, as it also lets you send in scanned documents from any location via email: glorified document FTP.
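For the mapped network location option, a sweep script can be as simple as the Python sketch below. It assumes the library shows up as an ordinary folder once it is mapped; the function name and the example paths are placeholders of mine, not anything SharePoint gives you.

```python
import shutil
from pathlib import Path


def push_scans(scan_dir, library_dir, pattern="*.pdf"):
    """Move scanned files into the mapped library folder.

    Returns the names of the files that were moved, in sorted order.
    """
    scan_dir, library_dir = Path(scan_dir), Path(library_dir)
    moved = []
    for scan in sorted(scan_dir.glob(pattern)):
        target = library_dir / scan.name
        if target.exists():
            # Leave the scan where it is rather than overwrite a library file.
            continue
        shutil.move(str(scan), str(target))
        moved.append(scan.name)
    return moved


# Example call with placeholder paths (scanner output folder, mapped library):
# push_scans(r"C:\Scans\Outbox", r"S:\Scan DropBox")
```

Skipping files that already exist in the library is a deliberate choice here; renaming on collision would work too, it just depends on how your scanner names its output.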
Step 3: Metadata
To make my life easier, I also like to customize the EDIT.aspx and VIEW.aspx in InfoPath to give some UI support for the common user.
Step 4: Workflow/PowerShell move of the file to its final home.
How can we do this with an OOB workflow, you ask? No, this is the part where you want to break open Visual Studio. I have done this with SP Designer, but I prefer a robust approach here. If you want to distribute the files to alternate Site Collections, or convert the files to another format, you can do this through Visual Studio, and it can be sexier this way, since it lets you perform these advanced functions. Give your developer something to do. I recommend this option over SP Designer because it can be a formal Site Feature that you can reuse. You can also write code to do your cleanup and delete the original file, or perform this action as a move.
The other nice part about doing this in code is that you can also re-classify to a content type in the destination if you like.
Remember this is like creating a Custom Send To Location, and just running the code to push the button.
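The decision part of that move is small enough to sketch in a few lines of Python. This is only the routing logic, not a SharePoint workflow: the site paths, the destination content types, and the idea of routing on a department value are all assumptions I made up for the example.

```python
# Hypothetical routing table: metadata value -> (final home, content type there).
ROUTES = {
    "Legal": ("/sites/legal/Contracts", "Contract"),
    "Human Resources": ("/sites/hr/Personnel Files", "Personnel Record"),
}
# Anything unrecognized stays a plain "Scanned File" in a catch-all library.
DEFAULT_ROUTE = ("/sites/scans/Unsorted", "Scanned File")


def plan_move(file_name, department, delete_original=True):
    """Decide where a scan goes and which content type it gets on arrival."""
    destination, content_type = ROUTES.get(department, DEFAULT_ROUTE)
    return {
        "source": file_name,
        "destination": f"{destination}/{file_name}",
        "content_type": content_type,
        # Deleting the original makes it a move; keeping it makes it a copy.
        "action": "move" if delete_original else "copy",
    }


print(plan_move("LEGAL_0042.pdf", "Legal"))
```

Keeping the routing table separate from the code that actually moves files means the developer can change destinations without touching the workflow itself.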
Step 5: Search. Ensure your index is running the PDF iFilter and has the home locations on a regular index routine.
So when someone asks whether you can do it: YEP, and I can give you ten ways to Sunday on how to. That is the SharePoint way, right? Remember, look at your options and Total Cost. If this is a pilot, or just for the Legal team or HR team to get some contracts in the system, I suggest looking at CodePlex first. If you are trying to kill off that warehouse of banker boxes that you lease from Uncle Charlie, go commercial; you will get your money’s worth. I can never say this enough: try before you buy.