Osaka-ben in anime movies?

Introduction

According to research on the popularity of anime in Japan, about 32% to 37% of the entire population frequently interacts with this form of media. Our hypothesis was therefore that the usage of Osaka-ben by prominent characters in anime movies could influence young people's spoken language.

The paper is composed of three main parts. First, we elaborate on the historical background of Osaka-ben, explaining the nuances in the definitions of 'dialects' in Japan and their influence. Second, we provide background on the term 'wakamono kotoba', look more closely at its emergence and cover some previous studies.

The third and final part is our quantitative research. Because wakamono kotoba is increasingly used on social media platforms, and a large share of it seems to stem from Kansai dialects, we decided to take a closer look at the possible links between the two. We tried to track the occurrence of Osaka dialect in four of the highest-ranking anime movies in Japan by analyzing the text of their scripts with Voyant Tools, Excel and Python. This final part explains our general research design, methods and conclusions in detail.

I. Background of 'Osaka-ben' (大阪弁)

Background on dialects in Japan:

Japan covers an area of 377,973 km², counts thousands of islands, of which approximately 400 are inhabited, and includes many wide mountain ranges with deep valleys. For that reason Japan has an enormous variety of dialects that differ per region, per island and across its 47 prefectures. The dialects spoken in Japan differ from each other to the point that some may be completely incomprehensible to fellow Japanese citizens from a different region.

However, the starkest contrast is that between the dialects spoken in the Kansai region and those spoken in the Kanto region, even though these are mutually intelligible. The Japanese spoken in Kansai used to be the standard, which changed after the capital of Japan moved to 'Edo', present-day Tokyo. Kansai dialect is currently the most widely known and used dialect. The Japanese of Kanto is called 'Tokyo dialect' and is nowadays also referred to as 'standard Japanese'.

Standard Japanese was developed in the period after WWII, when the usage of 'kana', one of Japan's writing systems, was reformed into 'gendai kanazukai' or 'modern kana usage' and used as the base for a more streamlined, standardized version of Japanese.

The largest city in the Kansai region is Osaka. This research will mainly focus on the Osaka dialect because of its popularity: it is the dialect with the most media exposure in Japan.

Kansai dialect, Kinki dialect, Kamigata-go and Osaka-ben

Kansai dialect is an umbrella term used to describe the dialects of the Kansai region (Kinki region), which include the prominent dialects of 'keihanshin': Kyoto, Kobe and Osaka. In technical terms it is also known as Kinki dialect.

The term Kinki originates in late 7th century Japan, when provinces were created through the Ritsuryo law system, and means "proximity to the capital", which was in Kyoto at the time. Kansai dialect has a history that goes back more than a thousand years, to when the imperial court was seated in Kyoto. Kansai dialect was then the 'standard' and has left traces in current standard Japanese.

Kamigata-go dates back to the Edo period and is used to describe dialects from Kyoto and Osaka. The political center was moved to Edo, which is nowadays Tokyo, while the imperial court stayed in Kyoto. This meant that the speech from Kyoto was used by people of high regard, therefore becoming kamigata-go, 上方語. Consequently, forms of 'keigo' are heavily based on these dialects.

Even though 'Kansai dialect' is used as the term for the dialects spoken in that region, every prefecture has its own distinguishing features. Osaka-ben stands out in particular.

One of the most notable features of Osaka-ben is its intonation, which is characterized by a rising and falling rhythm that is often described as "musical". The dialect is seen as lively and humorous, since its speakers use creative expressions to convey their thoughts. Another noticeable feature of Osaka-ben is its use of slang and informal speech, on which we elaborate later.

Osaka-ben also has a few words and phrases that are unique to the dialect. For example, the word "nambo" means "how much?". Since Osaka was one of Japan's trading centers, this question was often asked. Eventually the word became a unique expression that differs completely from standard Japanese.

Grammatically, Osaka-ben also has some noticeable features. For example, where a speaker in Tokyo would end a sentence with "yo", a speaker in Osaka would end it with "de". Furthermore, instead of the negative suffix "nai", Osaka-ben uses "hen".

These are a few examples to show the different colors of Osaka-ben so that we can now distinguish this unique dialect from the other Kansai dialects.

Influence of Osaka-ben

Because Osaka-ben is so lively, it has a large impact on modern life in Japan. For Osaka prefecture itself it became a symbol of Osaka's independence and strong spirit, and for the locals it is a big part of their identity.

When Japan's economy began to boom, many entertainment industries also rose, and Osaka-ben has since been used a lot in the media, most of all by comedians because of the dialect's lively nature. Before that, Osaka-ben was mostly used for the stereotypical villain, but thanks to its use by comedians the dialect gained a more positive image, becoming 'owarai kotoba'. As a result of this media presence, young people gained interest in the dialect.

Young people and teenagers have developed their own way of speaking by mixing different dialects and slang. This is known as 'wakamono kotoba' (youth language). As wakamono kotoba is heavily influenced by media such as TV programs, magazines and anime, there are variations of wakamono kotoba throughout Japan based on the dialect used in a specific region. Kansai dialect, the most prestigious dialect, is more widespread and is thus able to influence wakamono kotoba not only in the Kansai region but across Japan. The same argument could be made the other way around: the use of Osaka-ben by young people may have influenced newer popular media like anime and dramas, as the relationship is reciprocal.

II. Background of 'Wakamono kotoba' (若者言葉)

As mentioned earlier, youth language is a language variety distinct from dialect; in Japanese it is called wakamono kotoba. Youth language is an ever-changing phenomenon that is tied to the speaker, the time and the place. In other words, it is a speech variety used by young people to communicate within a specific social group. Within the overarching group of wakamono kotoba there are many smaller subcategories linked to the specific interests of each group of speakers and to the medium of communication.

Even though wakamono kotoba is always changing and words are often quickly abandoned after being created, some constant word formation techniques can be distinguished, namely affixation, compounding, clipping/abbreviation, reduplication and borrowing. The last one is particularly interesting for our research. Borrowing means that words are taken from one language and used in another, sometimes with a slightly adapted or narrowed meaning. The same technique could apply to the usage of dialect in wakamono kotoba.

Considering that wakamono kotoba evolves with its time, it is no surprise that this youth language is actively used on social media platforms nowadays. In Japan some of the most used platforms are LINE, Facebook, Skype and Twitter. Furthermore, wakamono kotoba also appears in popular media like movies, anime and comics.

Knowing all this, it is interesting to look at the possible relation between the occurrence of dialect in anime and the occurrence of dialect words in wakamono kotoba. Could the amount of dialect used in a popular medium like anime, consumed and discussed online by many young people, be enough to influence their speech? And is dialect present enough in these media for regional speech patterns to start occurring in youth language?

III. Quantitative research through text analysis

General research design

For this research we analyzed the scripts of the highest-ranking anime movies around March and April of 2023. This ranking has, of course, already changed, since new movies can make their way up the list. The four movies we chose are, in descending order: Demon Slayer: Mugen Train, Spirited Away, Your Name and Princess Mononoke. Using three main programs, Voyant Tools, Excel and Python, we analyzed the scripts to find the frequency of dialect and 'wakamono kotoba'. For this we composed a list of frequently used words and abbreviations in Osaka-ben and youth language.

Selected words (大阪弁) and their meaning:


1. めっちゃ	とても (very)
2. なんでやねん	どうしてそうなるんですか (why is that so)
3. まけて	安くして (make it cheap, bargaining)
4. ほな	では・じゃ (well then..)
5. おおきに	ありがとう (thanks)
6. アカン	ダメ (no good, bad)
7. ホンマ	本当 (really)
8. アホ	バカ (idiot)
9. はよ	早い (quick, quickly)
10. おもろい	面白い (interesting)
11. おもんない	面白くない (not interesting)
12. ちゃう	違う (wrong, different)
13. いてる	いる (to be)
14. おる	いる (to be)
15. あんた	あなた (you)

Voyant Tools

A useful tool for text analysis is Voyant Tools. This application allows you to upload one or several textual documents to form a corpus to work with. Among other things, Voyant Tools can generate a cirrus (or word cloud) of the most frequent words, give information about the context of certain words and look for the occurrence of certain terms. It is mainly this last feature that is central to our research. The goal is to analyze the Japanese movie scripts by entering a set of terms and looking at their patterns of occurrence and frequency throughout the text.

Once you upload a Japanese text into Voyant Tools, however, a major problem is immediately evident: Voyant Tools and the Japanese language are not compatible. The program does not properly recognize Japanese words. It often treats single characters (both kanji and hiragana) as a word when they are in reality just part of a word. After some research we tried to fix this by creating a detailed list of stopwords to filter out any meaningless bits.

The words selected for the list come from a blog article by Digitalnagasaki, the character names mentioned in the script and a self-defined list of random sounds the program kept recognizing as words.

Nonetheless, the same problem continued to occur after the stopwords were applied. Single characters were still recognized as full words. For example, the names entered in the stopwords list were not filtered out, because the algorithm saw the characters of the names in the corpus as independent, while in the stopwords list it saw them as compound kanji words.

This remaining problem makes it very difficult to work with Voyant Tools when analyzing a Japanese text. Even when you ignore all the special tools that give odd results because of this, it is also impractical when just looking for hits of a certain word. That is because some hits are actually only a part of a word and therefore incorrect.

Another difficulty we faced is the fact that movie scripts are written down in spoken language. As is known, spoken language can differ massively from the normative language. This aspect made it extra difficult for the program to recognize words properly.

It is for these reasons that we didn't opt for the use of Voyant Tools to execute our final research. If these bugs could be solved it would be a very interesting tool giving many different (visual) results, so further research on the use of Voyant Tools in combination with Japanese texts, specifically manifestations of spoken language, would certainly be interesting.

Excel

Excel is a great tool to visualize information and offers a wide range of features, which is why we wanted to try it for our research, specifically for text analysis. We wanted to see whether Excel could make clear which words were used, eventually with a visual representation. We mostly used it to organize our scripts and count the frequency of words.

We started by getting our scripts into Excel files. Our scripts came in two forms: Google Docs files and a website. Getting the data from the website into an Excel file was fairly easy: we copy-pasted it into Excel and did not need any formatting to get the script correctly into columns and rows.

It took some extra steps to get the scripts from the Docs files into Excel format. First we downloaded the Google Docs file as an HTML file and unzipped it. Afterwards we opened that file in Google Sheets. Finally we could download the Google Sheets file as an Excel file.

From then on we had all our scripts in one place, which made the research easier, although we still ran into trouble. At first we wanted to use the COUNTIF function, but Excel also seemed to have problems recognizing Japanese words: the only answer we got from this function was 0. While 0 could be a valid outcome if there really was no match for the word we searched, we made sure to look for words that were definitely in the script, so our first attempt failed. A possible explanation is that COUNTIF, when used without wildcards, only counts cells whose entire content equals the search term.

We also tried the FIND and FINDB functions. FINDB is intended for languages with double-byte character sets such as Japanese, Korean and Chinese, so it seemed particularly handy. We quickly found out, however, that we could only apply it to one column at a time; it would not let us search multiple columns in one go. Using it would have meant repeating the function column after column. So this function works for Japanese, but only when you need to find a word within a single column.
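
The one-column limitation described above can be avoided outside Excel. The following is a minimal sketch in Python, not the procedure we used: it assumes the script has been exported as rows of cells, and the names `rows` and `count_in_cells` as well as the sample lines are hypothetical.

```python
def count_in_cells(rows, term):
    """Count how often `term` occurs across all cells of all rows (substring matches)."""
    return sum(cell.count(term) for row in rows for cell in row)


# Hypothetical two-column excerpt: speaker in column 1, spoken line in column 2.
rows = [
    ["A", "はよ行こう"],
    ["B", "おはよう、はよ起きて"],
]

print(count_in_cells(rows, "はよ"))
```

Like a plain text search, this also counts partial hits such as the はよ inside おはよう, so the results would still need to be checked by hand.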

We then moved on to the find function of Excel, in the editing part of the home tab. This gives you the number of times a particular word is used, but it also gives you the word's location in the file. This function proved very useful for filtering the outcome of our search. Excel does not always return only the exact word you were looking for: you can also get words that contain the word you searched for. For example, when looking for the word はよ, Excel not only gives you that word but also other words containing it, like おはよう. So in the end we filtered those words out manually, which was made easier because word locations and sentences are also given.

Python

Python is a widely used programming language. We consulted Reddit users in the hope that they could inform us about more efficient ways to analyze text, and more specifically Japanese text, because we faced difficulties with Japanese in the existing text analysis tools. "The best option is using Python", was the response we received.

With the help of posts about similar research and guided by their work, we successfully constructed Python code that fits the purpose of our own research. We first, very naively, tried to write code on the online coding platform 'repl.it', but that turned out not to be specialized enough for the methods we initially wanted to use. That is why we then switched to PyCharm. We used this alongside a coding environment on a private server to solve specific software problems we had at the start.

After all software inconveniences were out of the way, we got to the part of actually constructing the code. By taking a closer look at the code example and through trial and error, we came to realize that the given code did not really fit the purpose of our research and that the type of code we actually needed had quite a simple structure. After several hours of searching for the most efficient constructions within our abilities, and testing the separate parts in the repl.it tool to make it easier to find and deal with potential errors, the final code was finished. The website 'W3Schools.com' proved very useful for looking up functions we could use.

The initial ambition was to have code that could accurately search our scripts for the selected words, count how many occurrences there are of each one and then give those data as output, along with some way of locating the matches in the body of the text. This last factor in particular proved challenging. First, we wanted to use regular expressions in Python to filter out matches that were in reality not accurate. For example, if you look for いてる, the code could also count that word when that specific combination of characters is merely part of a verb, and those are occurrences that we obviously do not want to count. However, combining regular expressions with Python on the one hand and, even more, with the Japanese language on the other hand, turned out to be a little too advanced for us.
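
To illustrate what such a filter could look like, here is a minimal sketch using Python's built-in `re` module. It is not part of our final code. The exclusion rule (rejecting はよ when it is preceded by お or followed by う, as in おはよう) and the sample sentence are assumptions chosen for illustration; each search term would need its own rule.

```python
import re

# Match はよ only when it is not part of おはよう:
# (?<!お) rejects a preceding お, (?!う) rejects a following う.
PATTERN = re.compile(r"(?<!お)はよ(?!う)")


def find_hits(text):
    """Return (position, matched text) for every accepted hit in `text`."""
    return [(m.start(), m.group()) for m in PATTERN.finditer(text)]


sample = "おはよう。はよ行かな。"
print(find_hits(sample))
```

Returning the position of each hit also covers the wish to locate matches in the body of the text.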

The code itself consists of two important parts. The first part starts by asking the user to input the script. Next, we made a list of variables to work with afterwards. In that list, we made a separate variable out of every one of the fifteen words we selected and then split those up again into all the possible spellings we could find, such as kanji, hiragana and katakana. As a result, we have a total of 44 variables, with each variable name consisting of a letter and a number. The variables with the same letter are variations of one word, and the numbers differentiate between the types of characters. For example, variables a_1, a_2, a_3 and b_1 respectively stand for "めっちゃ", "メッチャ", "滅茶" and "なんでやねん".
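
The naming scheme described above can also be expressed as a single dictionary instead of separate variables. The following is only a sketch of that alternative, not our actual code: the names `VARIANTS` and `all_spellings` are hypothetical, and only the spellings mentioned in the text are filled in.

```python
# Each selected word maps to the spellings that are searched for.
VARIANTS = {
    "めっちゃ": ["めっちゃ", "メッチャ", "滅茶"],  # corresponds to a_1, a_2, a_3
    "なんでやねん": ["なんでやねん"],             # corresponds to b_1
}


def all_spellings(variants):
    """Flatten the dictionary into one list of search strings."""
    return [s for spellings in variants.values() for s in spellings]


print(len(all_spellings(VARIANTS)))  # 4 for this excerpt
```

With this layout, adding a new word or spelling means adding one entry rather than a new variable.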

The second part of the code consists of the same structure, reused each time with a different set of variables. To take the set of a-variables as an example, the first line uses an "if"-clause to search for any occurrence of the word めっちゃ in the given script. After that, the occurrences of each spelling are counted and added up. The program then outputs a sentence telling the user which word was searched for and how many times it was found. Lastly, for each variable set we wrote a couple of lines that search for every separate match of a word in the text and output them. That way we know in which forms each searched word appears, so we can verify whether a match is accurate with another tool, such as Word, by using its find function. The structure is concluded with a corresponding "else"-clause, which outputs a simple sentence stating that no matches were found for the searched word. This organizes the output a little more and prevents a lot of empty lines when no matches are found for some words.
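
The if/else structure described above could be sketched as one reusable function. This is a simplified reconstruction under our own assumptions, not the original code: the function name `report`, the output wording and the sample text are hypothetical.

```python
def report(label, spellings, script):
    """Count every spelling of one word in `script` and build the report lines."""
    counts = {s: script.count(s) for s in spellings}
    total = sum(counts.values())
    if total > 0:
        lines = [f"{label}: {total} match(es)"]
        # List each spelling that actually occurs, so hits can be verified by hand.
        lines += [f"  {s}: {n}" for s, n in counts.items() if n > 0]
    else:
        lines = [f"{label}: no matches found"]
    return lines


script = "めっちゃええやん。メッチャ好き。めっちゃ寒い。"
for line in report("めっちゃ", ["めっちゃ", "メッチャ", "滅茶"], script):
    print(line)
```

Calling the function once per word replaces the repeated block for each variable set.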

To finish off our Python adventure, there is one thing left to report. After the code was completed, we of course needed to test it on our actual scripts. Because we had run all previous tests with parts of the code on shorter, simpler examples, we were confident that the code would no longer raise errors, but we nonetheless ran into a minor problem when pasting our first script into the input window. Since we already had data from the other tools, we had an idea of the results we should get. These, however, did not match the output of Python: Python gave us far fewer matches than expected. After some investigation we found that the text of our scripts had to be pasted as one single line. When you type an input line yourself, you cannot use the return key to start a new line, because it works as an 'enter' that submits the input. For the same reason the program only reads the first line of pasted input. So to read in a whole script that is written over multiple lines, we first have to delete every line break. We did a quick operation in Komodo Edit: we put each script into the editor and, with the 'replace' function and regular expressions, deleted all line breaks by replacing '\r' or '\n' with nothing. In the following picture you see the final output for, in this order, the Demon Slayer: Mugen Train script, the Spirited Away script, the Your Name script and the Princess Mononoke script. The green text of the full script continues on the same line outside of the frame.

[Image 12] [Image 13]
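
The removal of '\r' and '\n' described above can also be done in Python itself, which would make the detour through a text editor unnecessary. The following is a minimal sketch and not what we did: the function names `flatten` and `read_script` are hypothetical, and reading the script from a UTF-8 file instead of the input window is an assumption.

```python
def flatten(text):
    """Remove carriage returns and line feeds so the text becomes one line."""
    return text.replace("\r", "").replace("\n", "")


def read_script(path):
    """Read a whole script file at once instead of a single line of typed input."""
    with open(path, encoding="utf-8") as handle:
        return flatten(handle.read())


sample = "はよ行こ\r\nあんた誰\n"
print(flatten(sample))
```

Reading from a file avoids the limitation that typed input ends at the first line break.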

Final results:

大阪弁	Demon slayer	Spirited Away	Your Name
1. めっちゃ
2. なんでやねん
3. まけて
4. ほな
5. おおきに
6. アカン
7. ホンマ
8. アホ			1
9. はよ		2	3
10. おもろい
11. おもんない
12. ちゃう
13. いてる
14. おる		2	1
15. あんた	3	8	11
Occurrence of Osaka-ben per term and per movie.

IV. Results and conclusion of our research

The results of our quantitative research were meager, to say the least. Of the fifteen selected Osaka-ben terms only a few occurred in the movies, and when they did, it was in very limited numbers. We can thus conclude that Osaka dialect is not used frequently in popular anime movies. In other words, our hypothesis was not confirmed: if there is little to no dialect in such popular movies, it cannot influence the language usage of the people watching them.

On top of this, the dialect terms that do occur in the movie scripts are also used in general Japanese. One example is あんた, which means "you" and comes from the word あなた, which holds the same meaning. Considering that contraction or abbreviation is quite common in all language varieties (general spoken language, dialect, wakamono kotoba etc.), it is hard to determine which came first, the dialect or the general Japanese, seeing as regional speech evolved into a standard language, which in turn only emphasized the regional dialects even more. It is most likely a "chicken or egg" story. This leaves us uncertain whether the dialect we found was actually significant or just a manifestation of spoken language. The same uncertainty leaves room for further research on the topic, because this single study does not give conclusive results about the link between dialect and general spoken Japanese and, by extension, wakamono kotoba.

Bibliography

Allen, J. (2023, March 14). How Popular Is Anime In Japan, Really? Unseen Japan. https://unseenjapan.com/anime-viewership-popularity-japan/

An Introduction to Osaka's Unique and Cherished Dialect. (n.d.). https://osaka-info.jp/en/osaka/basic/osaka-dialect/

Borisova, A. (2018). The Influence of the Social Networks and Messengers on the Youth Language of the Japanese. The Asian Conference on Media, Communication & Film 2018: Official Conference Proceedings. ISSN: 2186-5906. https://papers.iafor.org/submission41776/

Danendra, M. D. (2021). The Formation of Japanese Wakamono Kotoba and Indonesian Bahasa Gaul. 430-435. https://doi.org/10.2991/assehr.k.211119.067

Digitalnagasaki. (2020). Web動画：Voyant-toolsで簡単テキスト分析：コロナウイルス感染症対策本部の会議資料をみてみよう. digitalnagasakiのブログ. https://digitalnagasaki.hatenablog.com/entry/2020/03/26/142953

Digitalnagasaki. (2016). 簡易テクスト分析にVoyant-Toolsもいかがでしょうか？. digitalnagasakiのブログ. https://digitalnagasaki.hatenablog.com/entry/2016/07/30/040123

Hashi. (2011, September 19). The Kana, They Are A-Changin'. Tofugu. https://www.tofugu.com/japanese/kana-changes-in-history/

Japanese language | Origin, Family, Alphabets, History, Grammar, & Writing | Britannica. (2023, March 28). https://www.britannica.com/topic/Japanese-language

Kansai Dialect - Osaka, Japan. (n.d.). Retrieved May 9, 2023, from http://osakatraveldestination.weebly.com/kansai-dialect.html

Kansai Regional Dialect - TV tropes. (n.d.). https://tvtropes.org/pmwiki/pmwiki.php/Main/KansaiRegionalAccent

Hattori, Y. (2020). Regional Variation on Loanword Adaptation in Japanese. https://researchrepository.wvu.edu/cgi/viewcontent.cgi?article=8626&context=etd

Kuwahara, S. (2012). The development of small islands in Japan: An historical perspective. Journal of Marine and Island Cultures, 1(1), 38-45. https://doi.org/10.1016/j.imic.2012.04.004

List of highest-grossing Japanese films. (n.d.). Wikipedia. https://en.wikipedia.org/wiki/List_of_highest-grossing_Japanese_films

MATCHA. (n.d.). 大阪に来るならめっちゃ大事。押さえておきたい6つの大阪弁. MATCHA - 訪日外国人観光客向けWebマガジン. Retrieved May 16, 2023, from https://matcha-jp.com/jp/175

Matsumoto, K., Akita, K., Keranmu, X., Yoshida, M., & Kita, K. (2014). Extraction Japanese Slang from Weblog Data based on Script Type and Stroke Count. Procedia Computer Science, 35, 464-473. https://doi.org/10.1016/j.procs.2014.08.127

Stopwords - Voyant Tools Help. (n.d.). Retrieved May 9, 2023, from https://voyant-tools.org/docs/#!/guide/stopwords

Sudipa, M. H. D., & Meilantari, N. L. G. (2022). Wakamono Kotoba in "Tokyo Revengers" by Ken Wakui: A Study of Morphology and Semantics. JAPANEDU: Jurnal Pendidikan Dan Pengajaran Bahasa Jepang, 7(1), Article 1. https://doi.org/10.17509/japanedu.v7i1.38996

The History of Japanese Languages. (2022, June 18). JapanesePod101.Com Blog. https://www.japanesepod101.com/blog/2022/06/18/the-history-of-japanese-languages/

Using Voyant Tools with Historical Japanese Texts. (2021, June 18). The Digital Orientalist. https://digitalorientalist.com/2021/06/18/using-voyant-tools-with-historical-japanese-texts/

What's the major difference(s) between the spoken Japanese dialect of Kansai area vs, metro Tokyo? (2015). Quora. https://www.quora.com/Whats-the-major-difference-s-between-the-spoken-Japanese-dialect-of-Kansai-area-vs-metro-Tokyo

よく使う大阪弁一覧｜よく分かる関西弁講座. (2017, August 29). よく分かる関西弁講座. https://kansaiben.com/2017/08/29/%e3%82%88%e3%81%8f%e4%bd%bf%e3%81%86%e5%a4%a7%e9%98%aa%e5%bc%81%e4%b8%80%e8%a6%a7/

Demon Slayer: Kimetsu no Yaiba - The Movie: Mugen Train (JPN) - Google Docs. (n.d.). Retrieved May 22, 2023, from https://docs.google.com/document/d/1TQFvn15_3sf3-0mAThgoGC4gCLOujb_yu4wLPK74jVY/edit

Princess Mononoke Movie Script. (n.d.). Retrieved May 22, 2023, from https://www.scripts.com/script.php?id=princess_mononoke_13983&translate=ja

Spirited Away (JPN) - Google Docs. (n.d.). Retrieved May 22, 2023, from https://docs.google.com/document/d/1flpx8BB0Ii9jFwaPuafdqEKhcAz0Xrqa5aGv860FJDg/edit

'Your Name.' Dialogue 1 | Mizuki Cantabile. (n.d.). Retrieved May 22, 2023, from https://ameblo.jp/u-juri/entry-12200390765.html

Excel functions (alphabetical) - Microsoft Support. (n.d.). Retrieved May 22, 2023, from https://support.microsoft.com/en-us/office/excel-functions-alphabetical-b3944572-255d-4efb-bb96-c6d90033e188#bm3