Ben Motz, a lecturer in the Department of Psychological and Brain Sciences, created his Mypage site as a way to share his research. However, because of University Information Technology Services’ “no-robots” policy on the Mypage web server at IU, the information Motz posts on his personal IU website cannot be searched by commercial search engines.
“I was surprised and it made me wonder why I was putting any information on my page in the first place,” he said. “For me, the goal of having a website wasn’t just to have an auxiliary Facebook page or just a picture of myself that people could get to from the faculty directory. It was for if I had an idea, a program that I had written or a statistical analysis that I came up with.”
UITS uses the Robots Exclusion Protocol, which allows website owners to give instructions to web robots in order to avoid excessive resource consumption and to keep pages out of search indexes, Craig Spanburg, manager of UITS Enterprise Web Tech Services, said in an email.
“Since its beginnings as php.indiana.edu and personal home pages, IU’s Mypage personal web page service has been a frequent target of robots, which overwhelm the service,” he said. “To ensure reliable access to the 20,000 student, faculty and staff Mypage sites, IU put in place a Robots Exclusion Protocol. The Mypage protocol restricts crawling from outside the University, protecting your personal content such as homework from indexing.”
The Mypage site can be reached through the department’s faculty directory, and it also comes up on Google, but none of its content is visible there, Motz said.
Motz programmed a system called the Lateralizer, which presents stimuli and records responses in a divided visual field so students can investigate theories of asymmetries between the two cerebral hemispheres. However, information about the program doesn’t come up directly on commercial search engines, Motz said.
“If I wanted to post this program on my page, I wouldn’t expect anyone to find it through me,” he said. “I would expect someone to say, ‘I would like to do a divided visual field experiment,’ but this wouldn’t come up. There is nothing I could do to make it come up.”
Jerome Busemeyer, professor in the Department of Psychological and Brain Sciences, has a similar problem. His research on quantum cognition isn’t indexed unless someone types in his name along with it, he said.
“It is more difficult for people to find my research and works,” he said. “You really have to search to find these things.”
Filippo Menczer, associate professor of informatics and computer science, said search engines use crawler programs to visit websites and record their content, so that when someone submits a query, the search engine knows which pages contain those keywords.
“When a webmaster, an administrator of a website, doesn’t want a particular crawler or any crawler to crawl their particular website or particular pages on that website, there is this robot exclusion standard which is sort of an informal standard that people adhere to to tell the crawler, ‘Look, please don’t go into these webpages,’” he said.
Menczer said that for mypage.iu.edu, the University has a file that tells all crawlers except the IU crawler not to index any pages there. As a result, a Mypage site would come up on an IU search engine but not on third-party search engines.
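A minimal sketch of how such a file behaves, using Python’s standard urllib.robotparser module. The actual contents of the Mypage robots file are not given in this story, so the rules below, the crawler name ExampleCampusBot and the example page path are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler may visit everything,
# every other crawler is asked to stay out of the whole site.
# (Illustrative only; not the real Mypage file.)
ROBOTS_TXT = """\
User-agent: ExampleCampusBot
Disallow:

User-agent: *
Disallow: /
"""


def build_parser(robots_text: str = ROBOTS_TXT) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def may_crawl(agent: str, url: str) -> bool:
    """Return True if the rules permit `agent` to fetch `url`."""
    return build_parser().can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://mypage.iu.edu/~example/"  # hypothetical page path
    for agent in ("ExampleCampusBot", "Googlebot"):
        print(agent, may_crawl(agent, page))
```

Under these assumed rules, the named campus crawler is permitted to fetch the page and any other well-behaved crawler is asked not to. The file is only a request: the page itself remains reachable by anyone who has the address.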
“I don’t think it is a big problem,” he said. “I am not saying it is the right thing to do. I would assume they thought about it so they have good reasons. If that is true, I don’t think it is a big problem because you can get around it in many ways.”
A meta tag placed in the header of a page’s HTML file tells the robot whether to index that particular page, Menczer said.
“That one overrules the general rule that indexes the entire website,” he said.
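A rough sketch of how a crawler might read such a tag, using Python’s standard html.parser module. The class and function names here are hypothetical, and the sketch assumes the common convention that a robots meta tag carries comma-separated directives such as index or noindex. A crawler can act on the tag only after it has fetched the page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def page_allows_indexing(html: str) -> bool:
    """Return False if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="index, follow"></head></html>'
    print(page_allows_indexing(sample))
```

A page with no robots meta tag is treated as indexable in this sketch, which matches the usual default.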
While the content of the website may not be searchable, none of the information is blocked once someone is on the site, Menczer said.
“The website is still open,” he said. “It is world accessible whether a search engine indexes it or not. It is very nice of IU to provide a free server to anyone who wants a page and to let you put anything you want on it but it is by far not the only way you can do it. There are so many options that there isn’t any shortage of ways for people to have home pages.”
Busemeyer said he wasn’t aware of this “no-robots” policy.
“I was surprised,” he said. “I need to now know how to fix this problem. Perhaps I will put the information elsewhere. I need to figure out how to fix it.”
UITS policy blocks faculty websites