Ultimate Server Maintenance Checklist [Updated for 2023]

To keep computer networks running smoothly and avoid downtime or data loss, you need to check servers regularly against a range of performance criteria. In organizations that run their own servers, this job falls to the network administrator. In other cases, managed service providers (MSPs) perform server maintenance and regularly monitor both server hardware and software to keep them operating at optimal levels.

Note: Our Free Server Maintenance Checklist Template is published along with our other information technology checklist templates on the templates page of our app.

This post expands on and provides updated information on our Monthly Server Maintenance Checklist published in 2020.

What is Server Maintenance and What is a Maintenance Checklist?

Server maintenance is the process of monitoring and updating the server to make sure it is running properly and without errors. The maintenance includes a continuous collection of information about the server, its periodic analysis, and a regular set of actions necessary to avoid problems that may occur or fix them. Such actions include, but are not limited to, reviewing the server’s performance, ensuring that automated system monitoring utilities are properly installed and configured, identifying potential security risks, and regular data backup.

Besides fixing existing errors, it’s crucial to understand that regular preventive maintenance is more effective. This way, issues can be spotted before they cause serious damage to your server and the business that depends on it. Preventive server maintenance also helps prepare your network for the times when something does go wrong.

To conduct the maintenance, you need a server maintenance plan: a list of all the actions you need to take and when to take them. Some checks are only necessary each month, while others happen daily. The easiest way to make sure nothing falls behind schedule or gets forgotten is to use server maintenance checklists.

Here are templates for the most common server maintenance checklists that will help you make the maintenance process easier, more structured, and controlled. 

“A good checklist is a reminder of the minimum necessary steps in a complex task. It should not be a substitute for a full understanding of the task, nor should it be a substitute for training and judgment. But it can be an invaluable aid.” – Atul Gawande, The Checklist Manifesto

Monthly Server Maintenance Checklist Template

Preparation

  • Name of Server
  • The static IP address of server computer
  • MAC address of server computer
  • Maintenance Date

Data, Software and System checks

  • Check backups are working
  • Check and update OS
  • Update your control panel
  • Check and update applications
  • Run database maintenance
      • Make sure backups are up to date
      • Check for database corruption
      • Make sure indexes are updated
  • Check remote management tools
      • Remote console
      • Remote reboot
      • Rescue mode
  • Check disk usage
  • Check CPU usage
  • Check RAM usage
  • Check network usage
  • Free up server storage space

Security Checks

  • Change server passwords
  • Download and install the patches
  • Test the patches
  • Confirm that the patches are installed correctly
  • Review User accounts
  • Perform a server malware scan

Hardware Checks

  • Check fans and power supplies
  • Check RAID fault tolerance
  • Check cable integrity
  • Check A/C unit at the facility (we have a separate checklist below for reviewing the server room)

Get started with our server maintenance checklist in Manifestly here.

Windows Server Maintenance Checklist

An overview of items that can be specific to the Windows OS.

“Preventative maintenance is critical to ensuring the availability, reliability, and performance of your servers. By following a systematic approach and regularly checking and maintaining your servers, you can prevent unexpected downtime and keep your business running smoothly.” – Microsoft

Run a malware scan

Malware is a type of software that can cause harm to your computer. Malware can be installed on your computer by downloading files, clicking on links in emails or browsing the web. Malware can steal passwords and personal information and install other malware and viruses.

Run a malware scan at least once a week with an anti-virus program such as Windows Defender or Malwarebytes to keep your system protected from malicious attacks.

Clear temp files

What are temp files? Temp files are temporary files that can be found in the Windows directory and other locations on your computer. They are created when you use programs, browse the web, and perform other day-to-day tasks on your PC. These files are used temporarily by programs until they need to be saved in another location or deleted from your system entirely.

System-wide temp files are stored in %WinDir%\Temp, and each user account has its own temp folder (%TEMP%). Web browsers also keep temporary internet files: cached copies of pages and images you have already viewed, so that they don’t have to be downloaded again when you revisit a site. Over time these files accumulate and take up disk space, so clear them out periodically.
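
As a rough illustration of this cleanup, the Python sketch below lists files in a folder that have not been modified for a given number of days and optionally deletes them. The function names, the seven-day cutoff, and the dry-run default are assumptions made for this example; they are not part of any Windows tool.

```python
import time
from pathlib import Path


def find_stale_files(directory, max_age_days=7, now=None):
    """Return files under `directory` not modified in the last `max_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for path in Path(directory).rglob("*"):
        try:
            if path.is_file() and path.stat().st_mtime < cutoff:
                stale.append(path)
        except OSError:
            # Files can vanish or be locked while we scan; skip them.
            continue
    return sorted(stale)


def delete_files(paths, dry_run=True):
    """Delete the given files. With dry_run=True, only report what would be removed."""
    removed = []
    for path in paths:
        if not dry_run:
            try:
                path.unlink()
            except OSError:
                # Still in use or permission denied; leave it in place.
                continue
        removed.append(path)
    return removed
```

Running it with dry_run=True first and reviewing the returned list is the safer habit, since files that are still open by a running program should be left alone.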

Defragment your hard drive

  • Defragmenting your hard drive is an essential step in keeping your Windows Server in top shape.
  • Windows Server automatically defragments its own drives, but you can manually defragment any connected hard drives as well.
  • To manually defragment a drive, open File Explorer, right-click the drive, and select Properties. Click the Tools tab, then click Optimize (labeled “Defragment now” on older Windows versions). You’ll see a list of all available drives; select the ones you want to defragment and start the optimization.

To optimize performance and stability over time, we recommend running this process every month or so. If you notice that your server has been misbehaving recently or taking unusually long to boot, run it immediately.

Check for outdated drivers

It’s a good idea to check your drivers regularly, as outdated drivers can cause performance issues. Driver updates can fix bugs and improve stability. Unlike other parts of Windows, drivers are not updated automatically unless you enable Windows Update or manually install an update from the manufacturer’s website.

To check for outdated drivers:

  • Open Device Manager by pressing [Windows] + [R], typing devmgmt.msc into the box, and pressing [Enter].
  • Expand each device category until you see the specific device you want to update (e.g., a sound card under “Sound, video and game controllers”). Right-click the device and choose “Update Driver Software…” from its context menu. This opens a window where you can either search for an updated driver online or browse to driver files stored locally, for example on a CD/DVD or on another computer on your network.

Clean up the event log

It’s a good idea to periodically clean up your event logs. This frees up space and makes it easier to find a specific problem when looking through the logs. If you’re not sure what an event means, leave it alone. But if one category holds a large number of events, or the logs are taking up too much space, it’s time to take action.

Check for unneeded services

Services are applications that run in the background, usually as part of Windows. Some services are essential to the operation of your server and should not be disabled unless you know what you’re doing. Others may be unnecessary for your use case and can be disabled safely. To see which of your services fall into each category, open the Services applet in Control Panel (or press [Windows] + [R], type services.msc, and press [Enter]). You’ll see a list of all services and their status, with options to pause or stop any service that is causing problems.

Important Service Checklist:

  • Automatic Updates
  • Diagnostic Policy Service
  • DNS Client
  • Distributed Link Tracking Client
  • HID Input Service (Human Interface Device Input)
  • Internet Connection Sharing (ICS)

Set up an automatic maintenance schedule with a third-party software solution

Setting up an automatic maintenance schedule is an important step in keeping your servers running smoothly. Microsoft recommends that you run regular scans, updates, and backups to maintain the health of your server. While it is simple enough to launch these tasks manually when you need to, doing so regularly can be time-consuming and error-prone. A third-party solution keeps track of all your scheduled tasks in one place so that they can be executed automatically at specific times or intervals.

Linux Server Maintenance Checklist

Items to include in your server maintenance checklist for Linux servers.

  • Update Control Panel
  • Clean Hardware and Determine if It Needs Updating
  • Check for Application Updates
  • Update OS
  • Check System Security
  • Add Hotfixes, Service Packs, Etc
  • Remote Scan for Vulnerabilities
  • Test Backup Archive Integrity
  • Verify Backups
  • Check Remote Management Tools
  • Check Server Utilization
  • Check for Hardware Errors
  • Monitor RAID Alarms
  • Check Disk Usage
  • Review User Accounts
  • Change Passwords
  • Test Recoveries

Server Room Maintenance Checklist 

Maintaining the physical room where servers are housed is as important as maintaining the servers themselves. Here are some items you might find useful if you are maintaining a server room.

  • Cooling systems
  • Electrical maintenance
  • Cleaning the room, floors, etc. 
  • Uninterruptible power supply (UPS), including backup batteries
  • Detection of water or moisture
  • Inspection of all cables (power and data)
  • Ensuring fire suppression systems work, including sprinkler systems in case of fire
  • Rack-based equipment used for air handling
  • Free standing equipment
  • Humidifier system
  • Chillers
  • Infrared scanning of power connections

Additional Server Maintenance Checklist Considerations

Server Maintenance Tips And Best Practices

Daily, Weekly, Monthly and more server checks

To properly maintain servers, you need different checks performed at different intervals. For example, you probably don’t need to check your backups every day. But if you wait more than a week to check on them, problems could arise and you wouldn’t be ready for them. You need a separate checklist for each interval, so you should have a daily, weekly, and monthly checklist. We’ve even seen network administrators create quarterly or annual ones. With Manifestly you can schedule daily, weekly, or quarterly workflow runs to ensure recurring tasks are completed on time. You’ll also get notifications on run activity, such as when a run has started.

Verify your backups are working

Be sure that your backups are working before making any changes to your production system. You may even want to run some test recoveries if you are going to delete critical data. While you should already have automatic system backups scheduled regularly, these efforts are in vain if you have never tested whether the backups do what they are supposed to do. Also confirm that the backups are being saved to the correct location.
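
One simple integrity check is to compare a backup archive against a checksum recorded when the backup was made. The Python sketch below assumes such a SHA-256 checksum was stored alongside the backup; the function names are hypothetical. A matching checksum only shows that the file is unchanged, so it complements test recoveries rather than replacing them.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(backup_path, expected_sha256):
    """Return True if the backup file's digest equals the recorded checksum."""
    return sha256_of(backup_path) == expected_sha256.strip().lower()
```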

Check application updates

Web applications account for more than 95% of all security breaches happening in the world. Using the most recent version ensures that any problem the developers have already corrected is no longer an issue for you. Remember to perform a complete backup before updating, just in case something breaks. With our Zapier integration, you can automate runs to start whenever a new update rolls out to your applications.

Check disk usage

Keep your production systems clean; they are not archival systems. Delete old logs, emails, and software versions that are no longer used. Keeping your system free of old software limits the security issues that can appear, and the less data you have, the faster it is to recover. Don’t let any disk exceed 90% of its capacity: either reduce usage or add more storage. If a partition reaches 100%, your server may stop responding, database tables can become corrupted, and data may be lost.
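
The 90% rule is easy to check in a script. The Python sketch below uses the standard library’s shutil.disk_usage; the function names and the list of paths to check are assumptions for illustration. The percentage it reports can differ slightly from tools such as df, which may account for reserved blocks differently.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total * 100


def partitions_over_threshold(paths, threshold=90.0):
    """Map each path whose filesystem is at or above `threshold` percent to its usage."""
    over = {}
    for path in paths:
        percent = disk_usage_percent(path)
        if percent >= threshold:
            over[path] = round(percent, 1)
    return over
```

For example, partitions_over_threshold(["/", "/var", "/home"]) would return only the mount points that need attention, assuming those paths exist on the server.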

Check server utilization

Review your server’s disk, CPU, RAM, and network utilization. Be proactive if they are nearing their limits. You may need to plan on adding resources to your server or migrating to a new one. With most monitoring tools, you can configure a notification to be sent when any usage reaches a certain threshold. That notification can in turn trigger a run for your team.
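
The threshold logic itself is simple. The Python sketch below assumes the utilization figures have already been collected by whatever monitoring tool you use and are expressed as percentages; the metric names and limits are hypothetical.

```python
def over_threshold(metrics, thresholds):
    """Return a message for every metric that has reached its configured limit.

    Both arguments map a metric name to a percentage. Metrics without a
    reading are skipped rather than treated as zero.
    """
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value >= limit:
            alerts.append(f"{name} at {value:.0f}% (limit {limit:.0f}%)")
    return alerts
```

Called with readings of {"cpu": 95, "ram": 40} and limits of {"cpu": 90, "ram": 85, "disk": 90}, it reports only the CPU, since RAM is below its limit and no disk reading was supplied.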

Update Your OS

Linux distributions release updates frequently, so it’s hard to keep track of all of them. This is why you should use automated patch management tools and have monitoring in place to alert you when a system is out of date. If you skip updates, or apply them only manually, you may miss important security updates and leave your servers at risk. Attackers often scan for vulnerable systems within hours of an issue being disclosed, so rapid response is the key to safety.

If you cannot automate your updates, create a schedule for updating your system. We believe weekly checks work best, but for older OS versions you can do them monthly. You need to monitor release notices from your distribution so you are aware of any major security threats. To help with this, you can set a run to start every time a new update goes live.

Change passwords

Changing passwords on a regular basis reduces the danger of live passwords falling into the hands of an attacker. You should change passwords every 3 to 6 months. If you have given out passwords to others for any reason, consider changing them once those people are done with their work. With our Departments & Locations feature, you can make this activity private so that only the people who need to know can see when passwords are changed.
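
If you keep a record of when each password was last changed, finding the ones that are due is a small date calculation. The Python sketch below assumes such a record exists as a mapping from account name to date; the function name and the 180-day default (the upper end of the 3 to 6 month range) are choices made for this example.

```python
from datetime import date, timedelta


def passwords_due(last_changed, today=None, max_age_days=180):
    """Return the accounts whose password is at least `max_age_days` old, sorted by name."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, changed in last_changed.items() if changed <= cutoff)
```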

Update your Control Panel

Control panel software (such as cPanel) requires manual updates. Updating cPanel updates only the control panel itself; you still need to update the applications that it manages. For example, if you are using WHM/cPanel, you must manually update PHP versions to pick up fixes for known issues.

RAID Alarms

You should be monitoring your RAID status. Just a single disk failure can cause a complete system failure. Even though only roughly 1% of servers per year experience a disk failure, a complete system failure turns a simple drive replacement into a disaster recovery scenario. That’s not something you want to deal with. As with application updates, you can automate runs to start whenever a RAID alarm goes off.

Check remote management tools

If your server is co-located or with a dedicated server provider, you will want to check that your remote management tools work. Remote console, remote reboot, and rescue mode are the three essential tools for remote server management. You need to make sure that they will work when you need them.

Check for hardware errors

Hardware problems are not common, but they cause serious trouble when they occur. Review the logs for any signs of hardware problems, such as disk read errors, network failures, and overheating notices. Even though these problems are rare, you don’t want to put your server at risk by not being cautious enough.
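
A log review like this can be partly automated by searching for tell-tale phrases. The Python sketch below is only an illustration: the patterns are examples of wording that can appear in system logs, and the actual messages vary by operating system, driver, and hardware, so treat the pattern list as an assumption to adapt.

```python
import re

# Illustrative patterns only; real log wording varies by OS, driver, and hardware.
HARDWARE_PATTERNS = {
    "disk": re.compile(r"I/O error|read error", re.IGNORECASE),
    "network": re.compile(r"link is down|link down", re.IGNORECASE),
    "thermal": re.compile(r"overheat|temperature above threshold", re.IGNORECASE),
}


def find_hardware_warnings(log_lines):
    """Return (line number, category, text) for each line matching a pattern."""
    findings = []
    for number, line in enumerate(log_lines, start=1):
        for category, pattern in HARDWARE_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, category, line.strip()))
                break  # one category per line is enough for a review list
    return findings
```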

Check cable integrity

Cable wear and tear is a big factor that is often forgotten when determining points of system failure. Checking it is a tedious task, but it needs to be included in the routine maintenance process for all wire-dependent hardware.

Review user accounts

Remove any user account that’s no longer relevant, whether because of staff changes, client cancellations, or any other user changes. Stale accounts and their data are not just a security risk; they can also present legal problems. Depending on your service contracts, you may not have the right to retain a client’s data after they have ended services.
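
At its core, this review is a comparison between the accounts that exist on the server and the people who should still have access. The Python sketch below assumes you can export both lists; the function name and the default list of system accounts to ignore are hypothetical.

```python
def stale_accounts(server_accounts, active_roster, system_accounts=("root",)):
    """Return accounts present on the server that are neither active nor system accounts."""
    keep = set(active_roster) | set(system_accounts)
    return sorted(set(server_accounts) - keep)
```

The result is a list of candidates for review, not for automatic deletion, since service accounts and shared logins may be missing from a staff roster.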

Be Smart About Scheduled Maintenance

Never schedule maintenance for a time when you won’t be able to get help. Always think ahead to avoid any national holidays or weekends. Instead, plan to execute during the week, when you can fall back on expert help if something goes wrong.
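
The rule about weekends and holidays can be expressed as a small date helper. The Python sketch below returns the first weekday on or after a given date that is not in a list of holidays; the function name is hypothetical, and the holiday list is something you would supply for your own country and organization.

```python
from datetime import date, timedelta


def next_maintenance_day(start, holidays=()):
    """Return the first day on or after `start` that is a weekday and not a holiday."""
    holidays = set(holidays)
    day = start
    # weekday() returns 5 for Saturday and 6 for Sunday.
    while day.weekday() >= 5 or day in holidays:
        day += timedelta(days=1)
    return day
```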

Check system security

We suggest a periodic review of your server’s security using a remote auditing tool such as Nessus. Regular security audits serve as a check on system configuration, OS updates, and other potential security risks. We recommend you do this monthly. At a minimum, we believe it should be done four times a year.

Perform a server malware scan

Running a malware check on your server machines should be part of your routine process. ClamAV is a useful tool for scanning Linux machines against databases of known viruses and malware.
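
ClamAV’s command-line scanner, clamscan, reports its result through its exit code: 0 when no virus is found, 1 when at least one is found, and 2 when an error occurred. The Python sketch below wraps a recursive scan and translates that code; it assumes clamscan is installed and on the PATH, and the function names are hypothetical.

```python
import subprocess

# Exit codes documented by clamscan.
CLAMSCAN_RESULTS = {0: "clean", 1: "infected", 2: "error"}


def interpret_clamscan_exit(returncode):
    """Translate a clamscan exit code into a short status string."""
    return CLAMSCAN_RESULTS.get(returncode, "unknown")


def scan_directory(path):
    """Scan `path` recursively and return (status, scanner output).

    Raises FileNotFoundError if clamscan is not installed.
    """
    completed = subprocess.run(
        ["clamscan", "--recursive", "--infected", path],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means "virus found", not a crash
    )
    return interpret_clamscan_exit(completed.returncode), completed.stdout
```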

Traffic control

Typical business networks run on TCP/IP. An incorrect TCP/IP setting results in address and routing problems. Always ensure that your server’s TCP/IP settings are correct.
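
Some of the most common mistakes, such as a default gateway outside the server’s own subnet, can be caught with a quick sanity check. The Python sketch below uses the standard library’s ipaddress module; the function name and the specific checks are choices made for this example and cover only a small part of a full TCP/IP configuration.

```python
import ipaddress


def check_ip_settings(address, netmask, gateway):
    """Return a list of problems found in a static IP configuration (empty if none)."""
    problems = []
    interface = ipaddress.ip_interface(f"{address}/{netmask}")
    network = interface.network
    gateway_ip = ipaddress.ip_address(gateway)

    if gateway_ip not in network:
        problems.append(f"gateway {gateway_ip} is outside {network}")
    if gateway_ip == interface.ip:
        problems.append("gateway is the same as the server address")
    # In ordinary subnets the first and last addresses are not usable by hosts.
    if network.prefixlen < 31 and interface.ip in (
        network.network_address,
        network.broadcast_address,
    ):
        problems.append(f"{interface.ip} is the network or broadcast address")
    return problems
```

For instance, a server at 192.168.1.10 with netmask 255.255.255.0 and gateway 192.168.2.1 would be flagged, because the gateway is not reachable within 192.168.1.0/24.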

How Manifestly Checklists Helps with Server Maintenance

Role Based Assignments so everyone knows their tasks and due dates

With our Role Based Assignments, the workflow you create for server maintenance can be assigned automatically to the team or person responsible. Notifications are sent via email, Slack, or MS Teams, including late notifications for steps or runs that are overdue. Everyone will know what is to be done and who is responsible.

Continuous Improvement is Built-in to our Software

With our platform, you can improve and update your workflows easily. With the help of our Process Improvement through Feedback feature, your team can submit ideas and feedback after completing every run. This feedback is consolidated in the reporting section of the workflows for easy access and will help you leverage it for continuous workflow and process standardization.

Automatically update other systems using our web hooks and API

Our web hooks feature lets you automatically update any system on the web with data from the workflow run. You can create custom JSON payloads with content tokens to fully customize the JSON sent to any URL.

With our open API, you can develop any integration you need. Keep in mind we already have 1000+ integrations available through Zapier, so check those out first.

Get a handle on your important recurring checklists.

With Manifestly, your team will Never Miss a Thing.