Mohave Community College Does Disaster Recovery on a Shoestring
Arizona's Mohave Community College lacks a big student body, big budget, or big IT staff. But it's got a big, sprawling area to cover -- getting from the main campus in Kingman, Arizona, to the farthest campus in Colorado City requires a four-hour desert drive through three states.
And six months after taking the job as CIO, Mark Van Pelt faced a big crisis: The entire domain went down for 72 hours, nearly shutting down the college. He knew that couldn't happen again, so he needed to overhaul the college's disaster recovery (DR) plan on a community college's shoestring budget.
"Seventy-two hours isn't good enough for a college. We have students who have to turn in homework and do labs online," Van Pelt told Light Reading in an interview at the VMworld customer and partner conference in Las Vegas late last month. "We have working parents and adult students who have to hit the infrastructure remotely."
The outage also took a personal toll on Van Pelt. "Three sleepless nights. Nobody likes that," he says. "I used to do that easier when I was 18. I can't do that anymore."
Mohave faced several constraints in updating its DR plan. In addition to the small budget, geographic sprawl, and small IT staff -- just 16 people -- Van Pelt and his team faced a bandwidth constraint. Fiber is unavailable in Mohave County; much of the connectivity comes from low-speed microwave connections.
On the plus side, the campus computing environment is 100% virtualized, other than backup domain controllers. Remote administration is a major reason for the virtualization. Mohave County is the fifth-largest county in the US by area. The four-hour drive from the main campus in the county seat of Kingman to Colorado City goes through Nevada (by way of Las Vegas), Utah, and Arizona. The nearest campus outside Kingman is an hour away, and Kingman has a remote campus located inside its city limits.
"With that much distance between our campuses, trying to maintain something in, say, Colorado City, manually, would be nightmarish for us," Van Pelt said.
The community college system has five campuses in all, with about 5,000 students.
Van Pelt took over as CIO about two and a half years ago, and the domain went down six months later. DR worked, but required three days: "In 2017, we got a DR test and discovered that 72 hours was the best we can do," Van Pelt said.
Running a secondary server was not an option, as older legacy apps are bound to a particular IP address.
Because Mohave uses VMware for virtualization, VMware NSX was the first place it looked for DR -- and the last too, as it implemented a plan using NSX, VMware's network virtualization and security platform.
"We tend to float toward VMware products first," Josh Walters, Mohave's senior VMware engineer, said. "We support VMware because we haven't had a problem with it."
Mohave used NSX to set up a "ghost network" connecting a backup data center in Kingman, which keeps all 24TB of Mohave's data updated via a gigabit connection. "It's a properly IP'd network that can't be seen by anything else," Van Pelt says.
Setting up the DR architecture was challenging. The campus network needed to be reconfigured properly. The college's network engineer is self-taught, though VMware-certified, and inherited a network from a series of predecessors. "We're a small shop," Van Pelt says. "A lot of times, when you have a small shop, you have configurations that are good enough but not optimized." That had to be corrected.
When the DR architecture was done, Mohave did a DR test in October, breaking the college system's link to Las Vegas to simulate an outage. Recovery time went from 72 hours to 48 minutes.
"When we did it we didn't believe it. So we failed it back over and did it again," Van Pelt said. "It worked for us."
He added, "For a tiny community college, being able to have that kind of resiliency and response is something you don't find."
— Mitch Wagner Executive Editor, Light Reading