TopCoder problem "SortBooks" used in SRM 262 (Division I Level One, Division II Level Two)



Problem Statement

    You are an avid book collector, and you have dutifully catalogued your books on your computer. You had just entered your last book when you discovered that sometimes you put the author in the first field and the title in the second field, and sometimes you put the title in the first field and the author in the second field. You decide to write a program to sort out the list such that the author comes first, then the title of the book.



To sort your list, you decide to mark a field as the title if it satisfies at least one of the following criteria:



- It contains at least one of the following words: "the", "and", or "of".

- It contains more than 3 space-delimited words.
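The two criteria above can be sketched as a single predicate. This is only an illustrative sketch, not the reference solution: the class name TitleRule and the method name isTitle are hypothetical, and it assumes that "contains a word" means a whole space-delimited word (example 4, where "andy rooney" is not treated as a title, suggests this reading). Splitting on runs of spaces is also an assumption, based on example 3.

```java
import java.util.Locale;

// Hypothetical helper class; the name TitleRule is not part of the problem.
class TitleRule {
    // Returns true if the field counts as a title under the two criteria.
    static boolean isTitle(String field) {
        // Split on runs of spaces so repeated spaces do not create empty words.
        String[] words = field.trim().split(" +");
        if (words.length > 3) {
            return true; // more than 3 space-delimited words
        }
        for (String word : words) {
            String w = word.toLowerCase(Locale.ROOT);
            // Whole-word, case-insensitive comparison: "Theodore" must not match "the".
            if (w.equals("the") || w.equals("and") || w.equals("of")) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, "THE HOBBIT" and "J R R Tolkien" are both classified as titles, while "andy rooney" and "Lost     Horizon" are not.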



You figure that following these rules will sort most of your books, but you need a routine to check the rest of them manually so you can sort them yourself.



You are given a String[] field1 and a String[] field2. Corresponding elements of field1 and field2 refer to the two catalogued fields of a single book. You are to return an int[] containing, in ascending order, the 0-based indexes of the books that need to be checked manually. A book must be checked manually if either both of its fields are titles or neither of its fields is a title according to the rules.
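One possible shape for the whole method is sketched below. It is not the official solution: the private helper isTitle is a hypothetical name, and the sketch assumes whole-word, case-insensitive matching and splitting on runs of spaces (as examples 3 and 4 suggest). A book needs a manual check exactly when both fields get the same classification, so the loop compares the two results for equality. The class is declared without `public` only so the snippet compiles from any file name.

```java
import java.util.ArrayList;
import java.util.List;

class SortBooks {
    public int[] checkManually(String[] field1, String[] field2) {
        List<Integer> manual = new ArrayList<>();
        for (int i = 0; i < field1.length; i++) {
            // Both titles or neither a title: the rules cannot tell the fields apart.
            if (isTitle(field1[i]) == isTitle(field2[i])) {
                manual.add(i);
            }
        }
        // Indexes were added in increasing order, so the result is already ascending.
        int[] result = new int[manual.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = manual.get(i);
        }
        return result;
    }

    // Hypothetical helper: applies the two title criteria to one field.
    private static boolean isTitle(String field) {
        String[] words = field.split(" +");
        if (words.length > 3) {
            return true;
        }
        for (String w : words) {
            if (w.equalsIgnoreCase("the") || w.equalsIgnoreCase("and") || w.equalsIgnoreCase("of")) {
                return true;
            }
        }
        return false;
    }
}
```

The sketch does constant work per word, which is far below any limit for inputs of at most 50 books with 50 characters per field.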
 

Definition

    
Class:SortBooks
Method:checkManually
Parameters:String[], String[]
Returns:int[]
Method signature:int[] checkManually(String[] field1, String[] field2)
(be sure your method is public)
    
 

Notes

-When checking for "the", "and", and "of", note that they are case insensitive.

Thus "the" = "tHE" = "THE", etc.
 

Constraints

-field1 will contain between 1 and 50 elements, inclusive.
-field2 will contain the same number of elements as field1.
-Each element in field1 and field2 will contain between 1 and 50 characters, inclusive.
-Each element in field1 and field2 will consist only of letters ('a'-'z', 'A'-'Z') and spaces.
-Each element in field1 and field2 will have no leading or trailing spaces.
 

Examples

0)
    
{ "J R R Tolkien", "THE Jungle BOOK" }
{ "THE HOBBIT", "RUDYARD KIPLING" }
Returns: {0 }
Both "J R R Tolkien" and "THE HOBBIT" are considered titles because "J R R Tolkien" has 4 words and "THE HOBBIT" contains "THE". Therefore the first book needs to be checked manually.
1)
    
{ "Scaramouche", "Dan Brown", "War and Peace" }
{ "Rafael Sabatini", "The Da Vinci Code", "Leo Tolstoy" }
Returns: {0 }
The first book needs to be checked because neither field satisfies a title criterion, so there is not enough information to tell which field is the title.
2)
    
{ "Aesop", "Little Women", "Hans Christian Anderson", "The Arabian Nights", 
  "Peter Christian Asbornsen", "Mr Poppers Penguins", "Enid Bagnold", "Miss Hickory",
  "Sir James Barrie", "The Wizard of OZ", "Ludwig Bemelmans", "The Five Chinese Brothers",
  "Edith Nesbit Bland", "The Enchanted Castle", "Edith Nesbit Bland",
  "Five Children and It", "Michael Bond", "The Children of Green Knowe",
  "James Boyd", "Caddie Woodlawn", "Walter Brooks", "The Runaway Bunny",
  "Margaret Wise Brown", "Big Red Barn", "Jean De Brunhoff",
  "Old Mother West Wind", "Frances Hodgson Burnett", "A Little Princess",
  "Frances Hodgson Burnett", "Mike Mulligan and His Steam Shovel",
  "Virginia Lee Burton", "The Enormous Egg", "Eleanor Cameron",
  "The Happy Orpheline", "Natalie Savage Carlson", "Through the Looking Glass",
  "Miguel Cervantes", "Secret of the Andes", "Beverly Cleary", "Henry Huggins",
  "Elizabeth Coatsworth", "The Adventures of Pinocchio", "Barbara Cooney",
  "The Little Lame Prince", "Paul Creswick", "The Courage of Sarah Noble",
  "Alice Dagliesh" }
{ "Aesops Fables", "Louisa May Alcott", "Fairy Tales", "Hans Christian Anderson",
  "East of the Sun and West of the Moon", "Richard and Florence Atwater",
  "National Velvet", "Carolyn Bailey", "Peter Pan", "Frank L Baum", "Madeline",
  "Claire Huchet Bishop", "The Railway Children", "Edith Nesbit Bland",
  "The Story of the Treasure Seekers", "Edith Nesbit Bland", "A Bear Called Paddington",
  "Lucy Boston", "Drums", "Carol Rylie Brink", "Freddy the Detective",
  "Margaret Wise Brown", "The Little Fur Family", "Moon Goodnight", "The Story of Babar",
  "Thornton W Burgess", "Little Lord Fauntleroy", "Frances Hodgson Burnett",
  "The Secret Garden", "Virginia Lee Burton", "The Little House", "Oliver Butterworth",
  "The Wonderful Flight to the Mushroom Planet", "Natalie Savage Carlson",
  "The Family Under the Bridge", "Lewis Carroll", "Don Quixote", "Ann Nolan Clark",
  "Beezus and Ramona", "Beverly Cleary", "The Cat Who Went to Heaven", "Carlo Collodi",
  "Chanticleer and the Fox", "Dinah Mulock Craik", "Robin Hood", "Alice Dagliesh",
  "The Bears on Hemlock Mountain" }
Returns: {0, 1, 2, 6, 7, 8, 10, 18, 19, 23, 26, 27, 36, 39, 44 }
3)
    
{ "Lost     Horizon" }
{ "James Hilton" }
Returns: {0 }
Words in titles and/or authors may be separated by more than one space.
4)
    
{ "andy rooney", "joe lofthouse", "Theodore Taylor" }
{ "love of life", "the arrest", "Softly Wandering" }
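Returns: {2 }
Only whole words count: "andy" does not match "and", "lofthouse" does not match "of", and "Theodore" does not match "the". Neither field of the third book is a title, so it needs to be checked manually.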

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=4557

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=7996&pm=4557

Writer:

Softwalker

Testers:

PabloGilberto, brett1479, Olexiy

Problem categories:

Search, String Parsing