Hi all!
I have a pretty convoluted character string that I'm trying to convert into a data frame. Here's an example of what the string looks like:
> string <- "A. MYTHOLOGICAL MOTIFS a. Creator A0. - A87. + . (2) 1, 2; (4) 4. b. Gods A101. - A493. (1) 19, 46; (2) 1, 2. B. ANIMALS a. Mythical animals B0. - B98. + . (1) 62, 63; (23) 6, 7. b. Magic animals B101. + . - B192.2. + . (1) 24; (7) 5, 33."
I want to make a row for each number that follows a number in parentheses (so, starting from the beginning, those would 1, 2, 3, 4, 5, 19, 46, etc.) (in the original dataset, these are story IDs). You can think about those hierarchically organized under three levels:
- a number in parentheses (the collection in which a story is found)
- a motif identified with a lowercase letter (e.g., "a. Creator A0. - A87. + ." or "b. Gods A101. - A493.")
- a category of motifs (e.g., "A. MYTHOLOGICAL MOTIFS")
I want the fourth column to contain each story ID; the third column to contain the number in the parentheses immediately preceding that number (the collection); the second column to include the name or an identifier for the motif (the names labeled with the lowercase letters); and the fourth column to contain the class of motifs (the names labeled with the uppercase letters).
So, for the above string, the data frame would come out as follows:
> dat <- data.frame("Motif_category" = c(rep("A. MYTHOLOGICAL MOTIFS", 7), rep("B. ANIMALS", 7)),
+ "Motif" = c(rep("a. Creator A0. - A87. + .", 3), rep("b. Gods A101. - A493.", 4),
+ rep("a. Mythical animals B0. - B908. + .", 4), rep("b. Magic animals B101. + . - B192.2. + .", 3)),
+ "Collection" = c(2, 2, 4, 1, 1, 2, 2, 1, 1, 23, 23, 1, 7, 7),
+ "Story_no" = c(1, 2, 4, 19, 46, 1, 2, 62, 63, 6, 7, 24, 5, 33))
> dat
Motif_category Motif Collection Story_no
1 A. MYTHOLOGICAL MOTIFS a. Creator A0. - A87. + . 2 1
2 A. MYTHOLOGICAL MOTIFS a. Creator A0. - A87. + . 2 2
3 A. MYTHOLOGICAL MOTIFS a. Creator A0. - A87. + . 4 4
4 A. MYTHOLOGICAL MOTIFS b. Gods A101. - A493. 1 19
5 A. MYTHOLOGICAL MOTIFS b. Gods A101. - A493. 1 46
6 A. MYTHOLOGICAL MOTIFS b. Gods A101. - A493. 2 1
7 A. MYTHOLOGICAL MOTIFS b. Gods A101. - A493. 2 2
8 B. ANIMALS a. Mythical animals B0. - B908. + . 1 62
9 B. ANIMALS a. Mythical animals B0. - B908. + . 1 63
10 B. ANIMALS a. Mythical animals B0. - B908. + . 23 6
11 B. ANIMALS a. Mythical animals B0. - B908. + . 23 7
12 B. ANIMALS b. Magic animals B101. + . - B192.2. + . 1 24
13 B. ANIMALS b. Magic animals B101. + . - B192.2. + . 7 5
14 B. ANIMALS b. Magic animals B101. + . - B192.2. + . 7 33
If anyone has any thoughts, that would be greatly appreciated!