When you are building a new website, you may want to show your customers or other stakeholders a live preview before actually going live. A staging website is a clone of your website that is hidden from search engines and the general public but viewable by those to whom you grant access. Staging websites are also useful when making significant changes to an existing website, letting you demo the changes before pushing them live.
Many hosting providers offer the option of creating a staging environment from your hosting account's cPanel, although this depends on your hosting package. If your package includes a staging option, the tool automates the replication of files and the database as well as the access restrictions.
Even if you do not have a staging option, you can still create a staging environment manually. You may stage your website on a subdomain of your main domain or within a separate path. For example, if your main domain is mydomain.com, you may stage at demo.mydomain.com or mydomain.com/demo. You may also stage your website at any other domain or subdomain that you own. If your website is brand new, you may stage in the same space where the final website will go live; in that case you only need to lift the restrictions to make the site available.
The steps for creating a staging environment manually are:
1. Block search engine access with robots.txt
You don't want search engines to crawl and index your website before it is ready. Also, if you are cloning an existing site, you don't want your SEO to be affected by the duplicate content. One way to instruct search engines not to index your pages is to edit the robots.txt file in the root folder where you are going to set up your staging site. Add the following lines:
User-agent: *
Disallow: /
Noindex: /
robots.txt is a plain text file that tells search engine bots how to crawl your website. You may write instructions for different user agents, but the wildcard (*) covers them all. The lines above disallow all bots from crawling your website. Be aware that Noindex is a non-standard directive: Google never officially supported it in robots.txt and announced in 2019 that it is ignored entirely, so treat it as a best-effort extra rather than a guarantee.
If your staging path is a subfolder like mydomain.com/demo, then you only need to block that path:
User-agent: *
Disallow: /demo/
Noindex: /demo/
Note that these directives are advisory and not honoured by all search engines, let alone rogue crawlers. You may want to add .htaccess-based restrictions on top to make sure your site is truly inaccessible.
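As one such extra measure, if your host runs Apache with mod_headers enabled, you can send an X-Robots-Tag response header from .htaccess; compliant crawlers treat it like a noindex meta tag on every file served from that directory. A minimal sketch, assuming mod_headers is available on your host:

<IfModule mod_headers.c>
    # Tell compliant crawlers not to index or follow anything served from here
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>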
2. Restrict public and bot access with .htaccess
If you are on a shared hosting account running the Apache web server, you may use your .htaccess file to restrict access to your website. The .htaccess file is a distributed configuration file, which means each directory can have a .htaccess file that sets access rules for that directory. If you are running Nginx, IIS or another web server, there is an equivalent configuration that achieves the same result, but here we discuss only the Apache .htaccess method.
There are two approaches to restricting access: password protection and IP-based restriction. You may use one of them, or ideally both for extra peace of mind.
a. Password protection
With this method, anyone who accesses your staging website is prompted for a username and password. To set this up, add the following lines to your .htaccess file:
AuthType Basic
AuthName "Protected Area"
AuthUserFile /mypath/.htpasswd
Require valid-user
The .htpasswd file stores the usernames and passwords of all allowed users; replace /mypath/ with the actual path you want to use, ideally somewhere outside your web root. Next, create the .htpasswd file at that location and add usernames and passwords in the [username]:[password] format. Sample entries:
adam:7dam1980!
bob:er$b$b2007*
For additional protection you should hash these passwords rather than storing them in plain text, for example with Apache's MD5-based (APR1) scheme. The equivalent entries after hashing would look like:
adam:$apr1$Mpp1qZCO$wrmoYmISq8sFIAIWoQttO1
bob:$apr1$udvzdHhQ$M6trLdUeSe7okbymWm59A0
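Rather than hashing entries by hand, you can generate them with the htpasswd utility that ships with Apache, assuming you have shell access on your host; /mypath/ is the same example path used above:

# Create the file and add the first user; -c creates (or overwrites) the file
htpasswd -c /mypath/.htpasswd adam
# Add further users without -c so the existing file is kept
htpasswd /mypath/.htpasswd bob

Each command prompts for the user's password and writes the hashed entry for you.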
Note that this method also keeps crawlers out, because content can only be accessed after password authentication.
b. IP address restriction
A stricter way of restricting access is to specify the IP addresses from which the site can be reached. This is particularly suitable if you and your clients use static public IP addresses, as is common where all computers on an office network share a single external IP. Add the allowed IP addresses to the .htaccess file. Say your external IP is 203.0.113.1 and your client's is 198.51.100.10; the entry would look like:
Order allow,deny
Allow from 203.0.113.1
Allow from 198.51.100.10
Deny from all
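Note that Order/Allow/Deny is the older Apache 2.2 syntax; many shared hosts still accept it via mod_access_compat, but on Apache 2.4 the native equivalent uses Require. A minimal sketch with the same example addresses, assuming Apache 2.4:

<RequireAny>
    # Allow only these addresses; everyone else receives 403 Forbidden
    Require ip 203.0.113.1
    Require ip 198.51.100.10
</RequireAny>

If you want to combine this with the password protection from section (a), so that visitors must both come from an allowed address and log in, wrap Require ip and Require valid-user inside a <RequireAll> block instead.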
3. Copy files and database
Now you may transfer your files and database to create a clone of your website in the chosen staging location. Files can usually just be copied, but databases generally need to be migrated properly. If you are using a CMS or e-commerce platform, use the cloning method appropriate for that platform. For example, in WordPress you need to rewrite the site URLs stored in the database, using a plugin such as WP Migrate DB; otherwise everything will still point at the old URLs. Choose a migration tool or plugin appropriate to your platform; one example sketch follows below.
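As an illustration, if your host provides shell access and WP-CLI, its search-replace command rewrites the URLs stored in a WordPress database, including inside serialized data; the domain names below are just the examples from earlier and should be replaced with your own:

# Preview the changes without modifying the database
wp search-replace 'https://mydomain.com' 'https://demo.mydomain.com' --dry-run
# Run the actual replacement once the preview looks right
wp search-replace 'https://mydomain.com' 'https://demo.mydomain.com'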
Once these restrictions are in place, bots and the general public can no longer access your website. If you hadn't put the measures in place from the start, your staging site may already have been indexed; in that case you can remove the previously indexed pages using Google's URL Removal Tool, available in Search Console. Happy staging!